1.  Introduction.

An understanding of how modelling assumptions affect the optimal military
strategies derived from mathematical models is essential when such models
are used to analyze complex problems of international significance.  In this
paper we continue the study by one of the authors of the effects of various
modelling assumptions on the structure of optimal tactical allocation
policies by systematically contrasting the solutions for a sequence of
idealized models.  These combat models are too simple to be taken literally
but should be interpreted as indicating general principles to serve as
hypotheses for subsequent higher resolution studies of real world problems
via computer simulation or field experimentation.

In previous papers [34], [35], [36], [37], [38] one of us has studied the
optimal control of deterministic Lanchester attrition processes.  A major
result of this previous research was that optimal tactical allocation
policies are quite sensitive to the precise nature of the combat model
adopted, even as to whether the tactical scenario lasts for a specified
period of time or terminates only when a predetermined "breakpoint" has been
reached.  We have shown [36] that whether or not concentration of all fire
on a single enemy target type is always the optimal fire distribution policy
depends on whether, for example, enemy target types undergo a "square-law"
or "linear-law" attrition process (see also [38]).  In the paper at hand, we
examine the effects on the structure of the optimal fire distribution policy
of whether combat attrition is modelled as a deterministic or a stochastic
process.  Although there has been a continuing discussion among military
operations analysts about the relative merits of deterministic versus
stochastic combat attrition models (in particular, see [4], [9]), there
apparently has been no systematic attempt to contrast optimal military
strategies derived from such different modelling approaches.

In order to keep the impact of modelling assumptions on optimal strategies
in sharp focus and also for reasons of mathematical tractability, we
consider a simple fire distribution problem for a homogeneous Y force in
Lanchester combat against heterogeneous X forces composed of two types of
weapon systems.  Our research approach is to study the same scenario
(prescribed duration battle) using a deterministic combat attrition model
and also a stochastic one and then to compare the corresponding optimal
fire distribution policies.

The solution to the deterministic problem is obtained using modern optimal
control theory (see [8], [27]).  As discussed in [37] and [41], the
non-negativity restrictions on the force levels are state variable
inequality constraints (henceforth abbreviated as SVIC's) and require
special treatment (appropriate modification of the usual maximum principle)
when active (see Chapter 6 of [27], [40]).  In this paper we shall treat
SVIC's by the method of Speyer and Bryson [32] (see also [19], [24]) of
adjoining an SVIC directly to the return functional with a (Lagrange)
multiplier (see [41]).  Unlike the corresponding terminal control problem
studied in [34], however, this "solution" requires several computer assisted
computations for implementation.

Footnote: In this paper we employ an equivalent statement of the Pontryagin
maximum principle [27] commonly used by engineers in the United States.
There is a minor sign difference (see p. 108 of [8]) between these versions.

The solution to the stochastic problem is obtained using the well-known
dynamic programming approach to optimal stochastic control [13], [21], [12].
The basic equations of optimality (the fundamental functional equation for
the optimal expected-value function (see [12])) are developed.  We derive
analytic solutions to these equations for very small numbers of combatants
and thus obtain the optimal closed-loop control.  As is the case for the
Lanchester stochastic process (see [9], [20]), a general solution for
arbitrary numbers of combatants has not been obtained for the fundamental
functional equation (actually a system of differential-difference
equations), although solutions for specific (small) numbers of combatants
are readily obtained.  Therefore, we have used finite-difference methods to
generate an approximate numerical solution.

The  body  of  this  paper  is  organized  in  the  following  fashion.   First, 
we  review  a  few  relevant  facts  about  the  Lanchester  stochastic  process. 
Then  we  state  the  two  optimal  control  problems  that  this  paper  compares. 
The  method  of  solving  the  deterministic  problem  is  outlined.   The  basic 
equations  of  optimality  for  the  stochastic  control  problem  are  developed, 
and  obtaining  an  analytic  solution  to  these  equations  is  discussed.   The 
use  of  finite  difference  methods  for  generating  a  numerical  solution  is 
described.  Then we compare results obtained from the two models and
discuss these results.  The implications of these results for defense
planners and military operations analysts are pointed out.

2.  The Lanchester Stochastic Process.

In 1914 in the British journal Engineering, F. W. Lanchester [23]
postulated that under the conditions of "modern warfare" combat between two
homogeneous forces could be described by the equations

$$ \frac{dx}{dt} = -a y, \qquad \frac{dy}{dt} = -b x, \tag{1} $$

where a, b are commonly referred to as the Lanchester attrition-rate
coefficients and x(t), y(t) are force levels.  During World War II,
B. Koopman suggested a reformulation of such a model in stochastic form
[25].  Subsequent work on stochastic models of combat attrition has been
done by R. Snow [31], R. Brown [6], [7], G. Weiss [44], D. Smith [30], and
G. Clark [9].  The stochastic process corresponding to a model like (1)
has been called the Lanchester stochastic process by B. Koopman [20].

Footnote: See [45] for a discussion of the assumptions inherent in (1).  A
further discussion of Lanchester-type equations of warfare can be found in
[39].  Further references on deterministic Lanchester formulations can be
found there [39] or in [11].
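For reference, the deterministic equations (1) have a closed-form solution in hyperbolic functions. The following minimal Python sketch evaluates it; the function name and the sample numbers are illustrative assumptions, not taken from the original study, and the formula is meaningful only while both force levels remain positive.

```python
import math

def lanchester_square_law(x0, y0, a, b, t):
    """Closed-form solution of dx/dt = -a*y, dy/dt = -b*x at time t.

    Valid only while both x(t) and y(t) are still positive.
    """
    w = math.sqrt(a * b)
    cosh_wt, sinh_wt = math.cosh(w * t), math.sinh(w * t)
    x = x0 * cosh_wt - y0 * math.sqrt(a / b) * sinh_wt
    y = y0 * cosh_wt - x0 * math.sqrt(b / a) * sinh_wt
    return x, y

if __name__ == "__main__":
    # Hypothetical force levels and attrition-rate coefficients.
    x, y = lanchester_square_law(10.0, 8.0, a=0.05, b=0.04, t=3.0)
    # The quantity b*x^2 - a*y^2 is constant along any solution of (1).
    print(x, y, 0.04 * x * x - 0.05 * y * y)
```

The conserved quantity b x^2 - a y^2 printed at the end is the familiar "square law" that gives this attrition structure its name.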

Before considering the optimal stochastic control problem, it seems
appropriate for us to review a few results for the Lanchester stochastic
process.  Consider combat between a homogeneous X force and a homogeneous
Y force.  Let us model this combat as a continuous parameter Markov chain
with stationary transition probabilities (see pp. 188-189 of [26] for a
further discussion of terminology).  Let M(t) denote the (integer) number
of X combatants "alive" at time t after the battle begins, and let N(t)
denote the number of Y combatants.  We denote the state probability by
P(t,m,n), and thus

$$ P(t,m,n) = \mathrm{Prob}[M(t)=m,\ N(t)=n]. $$

Making standard assumptions (see [5]), we find that the state probabilities
satisfy the following system of differential-difference equations for
$1 \le m \le m_0$ and $1 \le n \le n_0$:

$$ \frac{dP}{dt}(t,m,n) = P(t,m+1,n)A(m+1,n) + P(t,m,n+1)B(m,n+1) - \{A(m,n) + B(m,n)\}P(t,m,n), \tag{2} $$

where $m_0$ ($n_0$) is the number of X (Y) combatants at the beginning of
battle at t = 0, i.e. $M(t=0) = m_0$ with probability one; A(m,n) is the
rate of attrition of the X forces with A(0,n) = 0; and B(m,n) is the rate
of attrition of the Y forces with B(m,0) = 0.  In other words, we have

$$ \mathrm{Prob}[\text{one } X \text{ casualty in time interval from } t \text{ to } t + \Delta t] = A(m,n)\,\Delta t. $$

(Moreover, P(t,m,n) is, more precisely, the transition probability

$$ P(t,m,n) = P(t,m,n;\,t=0,m_0,n_0) = \mathrm{Prob}[M(t)=m,\ N(t)=n \mid M(t=0)=m_0,\ N(t=0)=n_0]. $$

)

Footnote: Random variables are denoted by capital letters, while their
realizations are denoted by the corresponding lower case letters.

Footnote: We adopt the convention that P(t,m,n) = 0 for either $m > m_0$ or
$n > n_0$.


Of course, the state space is discrete, i.e. $m = 0,1,\ldots,m_0$ and
$n = 0,1,\ldots,n_0$.  At state space boundaries, i.e. m = 0 or n = 0,
equation (2) takes the form

$$ \frac{dP}{dt}(t,m,0) = P(t,m+1,0)A(m+1,0) + P(t,m,1)B(m,1) - P(t,m,0)A(m,0), $$

$$ \frac{dP}{dt}(t,0,n) = P(t,0,n+1)B(0,n+1) + P(t,1,n)A(1,n) - P(t,0,n)B(0,n), $$

$$ \frac{dP}{dt}(t,0,0) = P(t,1,0)A(1,0) + P(t,0,1)B(0,1). \tag{3} $$

Initial conditions for (2) and (3) are

$$ P(t=0,m,n) = \begin{cases} 1 & \text{for } m = m_0,\ n = n_0, \\ 0 & \text{otherwise.} \end{cases} \tag{4} $$


Let us adopt the following terminology for the attrition rates (and hence
the process itself).  We say that we have a

(a)  linear-law attrition process when

$$ A(m,n) = amn, \qquad B(m,n) = bmn, \tag{5} $$

and (b)  square-law attrition process when

$$ A(m,n) = \beta m + an, \qquad B(m,n) = bm + \alpha n, \tag{6} $$

where $\alpha$, $\beta$ may be referred to as operational loss rates.
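To make the system (2) through (4) concrete, the sketch below integrates it numerically for a small square-law battle. It is only an illustration: the function names, the classical fourth-order Runge-Kutta scheme, and the default of zero operational loss rates are our assumptions rather than anything prescribed in the text.

```python
import math

def rates(m, n, a, b, alpha=0.0, beta=0.0):
    """Square-law rates (6), with the conventions A(0,n) = 0 and B(m,0) = 0."""
    A = beta * m + a * n if m > 0 else 0.0
    B = b * m + alpha * n if n > 0 else 0.0
    return A, B

def derivative(P, a, b):
    """Right-hand side of (2); terms referring to states off the grid are dropped."""
    m0, n0 = len(P) - 1, len(P[0]) - 1
    dP = [[0.0] * (n0 + 1) for _ in range(m0 + 1)]
    for m in range(m0 + 1):
        for n in range(n0 + 1):
            A, B = rates(m, n, a, b)
            value = -(A + B) * P[m][n]
            if m < m0:  # inflow from (m+1, n) through an X casualty
                value += P[m + 1][n] * rates(m + 1, n, a, b)[0]
            if n < n0:  # inflow from (m, n+1) through a Y casualty
                value += P[m][n + 1] * rates(m, n + 1, a, b)[1]
            dP[m][n] = value
    return dP

def shifted(P, K, c):
    """Return P + c*K elementwise."""
    return [[p + c * k for p, k in zip(rp, rk)] for rp, rk in zip(P, K)]

def state_probabilities(m0, n0, a, b, t_end, steps=2000):
    """Integrate from the initial condition (4) up to t_end with RK4."""
    P = [[0.0] * (n0 + 1) for _ in range(m0 + 1)]
    P[m0][n0] = 1.0
    h = t_end / steps
    for _ in range(steps):
        k1 = derivative(P, a, b)
        k2 = derivative(shifted(P, k1, h / 2), a, b)
        k3 = derivative(shifted(P, k2, h / 2), a, b)
        k4 = derivative(shifted(P, k3, h), a, b)
        for K, weight in ((k1, 1), (k2, 2), (k3, 2), (k4, 1)):
            P = shifted(P, K, weight * h / 6)
    return P

def total_probability(P):
    return math.fsum(math.fsum(row) for row in P)

if __name__ == "__main__":
    P = state_probabilities(3, 3, a=0.2, b=0.1, t_end=1.0)
    print(total_probability(P), P[3][3])
```

Two simple checks are available for such a computation: the state probabilities should sum to one at all times, and the probability of still being in the initial state should decay as exp(-{A(m_0,n_0) + B(m_0,n_0)} t).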

Although it is well-known that (2) through (4) yield an exponential
solution (the Chapman-Kolmogorov equation expresses the semi-group property
of the state probabilities (see [20])) when A(m,n) and B(m,n) have been
specified (for example, by (6)), general solutions to this system which
apply for all values of $m_0$ and $n_0$ have been obtained in only a few
special cases.  In the special case when $a + \alpha = b + \beta$, Isbell
and Marlow [18] developed a general solution to (2) through (4) for a
square-law stochastic attrition process.  Recently, Clark (see pp. 102-104
of [9]) developed the general solution to the linear-law stochastic
attrition process (i.e. A(m,n) and B(m,n) are given by (5)).

One reason why we have reviewed this material is to point out to the
reader that a general solution to (2) through (4) exists only for a
linear-law attrition process and is very complex (see pp. 102-104 of [9]).
In considering the optimal control of the Lanchester stochastic (square-law)
process, we will encounter a similar system of equations for the optimal
expected-value function.  Keeping in mind that a general solution has not
been obtained to the corresponding equations (2) through (4) for the state
probabilities of the square-law stochastic attrition process, the reader
will not be surprised to learn that we have not developed an analytic
solution for the general case of these equations.

Additionally, using the above noted solutions for the Lanchester
stochastic process, Clark (following results in [25] and qualitative results
in [31]) made comparisons [9] (see also Chapter 11 of [4]) of the average
force levels in the stochastic process (denoted as m(t) and n(t)) and the
corresponding force levels x(t) and y(t) in the deterministic formulation
(such as (1)).  Unlike the corresponding situation for the Yule-Furry
linear birth process (see pp. 77-78 of [3] or pp. 156-159 of [10]), there
is a bias (due to "boundary effects") in the dynamical behavior of x(t) and
y(t) as compared with m(t) and n(t) for the same values of a and b.  It
turns out that m(t) lies above x(t), and the amount of separation grows
over time.
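The origin of this bias can be seen from a short calculation, sketched here under the assumption (ours, for illustration) of pure square-law rates A(m,n) = an for m >= 1 and B(m,n) = bm for n >= 1, i.e. no operational losses. Multiplying the differential-difference equations for P(t,m,n) by m and summing over the whole state space, the gain and loss terms for Y casualties cancel and the terms for X casualties telescope, leaving

```latex
\begin{align*}
\frac{d}{dt}E[M(t)]
  &= \sum_{m=0}^{m_0}\sum_{n=0}^{n_0} m\,\frac{dP}{dt}(t,m,n)
   = -\sum_{m=1}^{m_0}\sum_{n=0}^{n_0} A(m,n)\,P(t,m,n) \\
  &= -a\,E[N(t)] + a\sum_{n=1}^{n_0} n\,P(t,0,n).
\end{align*}
```

The first term reproduces the deterministic equation dx/dt = -ay. The second term is nonnegative and is negligible only while the boundary states (0,n) carry negligible probability; a symmetric expression holds for E[N(t)]. The mean force levels therefore follow the deterministic equations only until one of the boundaries m = 0 or n = 0 is reached with appreciable probability.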

The above is a major result of Clark's careful investigation in which
several numerical examples are given to illustrate such points.  He
concludes that (see p. 11-19 of [4]) "the deterministic model would have
difficulty approximating a stochastic simulation" with respect to the time
history of force levels.  Clark's solution to the stochastic linear-law
process was important in making such a comparison.  The fact that the
average of the Lanchester stochastic process does not behave identically
to the corresponding force levels x(t) and y(t) computed according to the
corresponding deterministic model has motivated the paper at hand.


3.   The  Optimal  Control  Problems. 

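In this section we state the two optimal control problems that are
considered in this paper.  The deterministic optimal control problem
considered is

$$ \max_{\phi_D(t)} \{\, r\,y(t_f) - p\,x_1(t_f) - q\,x_2(t_f) \,\} \quad \text{with } t_{max} \text{ specified}, $$

subject to:

$$ \frac{dx_1}{dt} = -\phi_D a_1 y, \qquad \frac{dx_2}{dt} = -(1-\phi_D) a_2 y, \qquad \frac{dy}{dt} = -b_1 x_1 - b_2 x_2, \tag{7} $$

$$ x_1, x_2, y \ge 0, \qquad 0 \le \phi_D \le 1, \qquad \text{and} \quad t_f \le t_{max}, $$

with initial conditions

$$ x_1(t=0) = x_1^0, \qquad x_2(t=0) = x_2^0, \qquad y(t=0) = y_0, $$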

where all symbols are explained in the Appendix.  In this problem $x_1$,
$x_2$, and $y$ are called state variables, while $\phi_D$ is called a
control (or decision) variable.  A constraint such as $x_1 \ge 0$ is called
a state variable inequality constraint (SVIC) and requires special
treatment (see below).

The battle lasts for $0 \le t \le t_{max}$ unless, of course, one side or
the other is annihilated before $t_{max}$.  To be more precise, the battle
terminates under one of the three following circumstances:

(1)  $x_1(t_f) = x_2(t_f) = 0$ and $t_f \le t_{max}$,

(2)  $y(t_f) = 0$ and $t_f \le t_{max}$,

(3)  $t_f = t_{max}$,


where $t_f$ denotes the time at which the battle ends.  Upon further
analysis, it has been convenient to consider that there are eight "terminal
states," or "target sets."  These are shown in Table I.  The reader should
note that for $S_4$ through $S_8$ the battle ends by the system (as
described by the three state variables $x_1$, $x_2$, and $y$) being driven
to a prescribed terminal state.  For these terminal states, $t_f$ is
undetermined when $t_f < t_{max}$, since it is then determined by entry to
the terminal state, and this depends upon the control used.  For these
cases a well-known transversality condition must hold.  The above problem
(7) is called a prescribed duration battle, since the battle lasts for a
maximum duration of $t_{max}$, i.e. $t_f \le t_{max}$.
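The deterministic problem (7) is easy to explore numerically once a fire distribution policy is fixed. The following minimal sketch integrates the state equations with a simple Euler scheme and stops under the three termination circumstances listed above; the function names, the step size, the sample policy, and all numerical values are assumptions made for illustration and do not reproduce the original FORTRAN computations.

```python
import math

def simulate_battle(x1, x2, y, a1, a2, b1, b2, t_max, policy, dt=1e-3):
    """Euler integration of the state equations in (7).

    policy(t, x1, x2, y) returns the fraction of Y's fire aimed at X1.
    Returns (t_f, x1, x2, y) when the battle ends.
    """
    n_steps = max(1, math.ceil(t_max / dt))
    h = t_max / n_steps
    for k in range(n_steps):
        # Battle ends early if Y or both X types are annihilated.
        if y <= 0.0 or (x1 <= 0.0 and x2 <= 0.0):
            return k * h, x1, x2, y
        phi = min(1.0, max(0.0, policy(k * h, x1, x2, y)))
        dx1 = -phi * a1 * y if x1 > 0.0 else 0.0
        dx2 = -(1.0 - phi) * a2 * y if x2 > 0.0 else 0.0
        dy = -(b1 * x1 + b2 * x2)
        # Force levels are clipped at zero (the non-negativity constraints).
        x1 = max(0.0, x1 + h * dx1)
        x2 = max(0.0, x2 + h * dx2)
        y = max(0.0, y + h * dy)
    return t_max, x1, x2, y

def payoff(x1, x2, y, p, q, r):
    """Criterion of (7): r*y(t_f) - p*x1(t_f) - q*x2(t_f)."""
    return r * y - p * x1 - q * x2

def x1_first(t, x1, x2, y):
    """Sample policy: concentrate all fire on X1 while it survives."""
    return 1.0 if x1 > 0.0 else 0.0

if __name__ == "__main__":
    t_f, x1, x2, y = simulate_battle(5.0, 5.0, 20.0, 0.1, 0.1, 0.05, 0.05,
                                     2.0, x1_first)
    print(t_f, x1, x2, y, payoff(x1, x2, y, p=1.0, q=1.0, r=1.0))
```

Because the policy is passed in as a function of time and state, the same routine can evaluate either an open-loop or a closed-loop control.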

The corresponding stochastic optimal control problem considered is

$$ \text{maximize } E[\, r\,N(t_f) - p\,M_1(t_f) - q\,M_2(t_f) \,] \quad \text{with } t_f \text{ specified}, $$

subject to:  casualties occur randomly as a continuous parameter Markov
             chain with stationary transition probabilities corresponding
             to the deterministic process (7),                         (8)

$$ M_1, M_2, N \ge 0 \quad \text{and} \quad 0 \le \phi_S \le 1, $$

where the random variables $M_1(t)$, $M_2(t)$, $N(t)$ are force levels
(integers), $E[\cdot]$ denotes mathematical expectation, and all other
symbols are explained in the Appendix.  In (8) $\phi_S = \phi_S(t,m_1,m_2,n)$
denotes a closed-loop control (see [16]).  For the deterministic problem (7)
we have not been precise about this point, since it is well-known that
open-loop control (e.g. $\phi_D = \phi_D(t; x_1^0, x_2^0, y_0)$) and
closed-loop control (e.g. $\phi_D = k(t,x_1,x_2,y)$) are equivalent and
yield identical results in trajectory and payoff [16].  For stochastic
control problems this equivalence is, of course, not true (see [12]).
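A Monte Carlo sketch may help fix ideas about the stochastic problem. It assumes, as our reading of "corresponding to the deterministic process (7)", that X1 casualties occur at rate phi*a1*n, X2 casualties at rate (1-phi)*a2*n, and Y casualties at rate b1*m1 + b2*m2. As a further simplification the policy is re-evaluated only when a casualty occurs, so a policy that varies with time between casualties is only approximated. All names and numbers are hypothetical.

```python
import random

def simulate_once(m1, m2, n, a1, a2, b1, b2, t_f, policy, rng):
    """One realization of the casualty process up to time t_f."""
    t = 0.0
    while n > 0 and (m1 > 0 or m2 > 0):
        phi = min(1.0, max(0.0, policy(t, m1, m2, n)))
        rate_x1 = phi * a1 * n if m1 > 0 else 0.0
        rate_x2 = (1.0 - phi) * a2 * n if m2 > 0 else 0.0
        rate_y = b1 * m1 + b2 * m2
        total = rate_x1 + rate_x2 + rate_y
        if total <= 0.0:
            break
        t += rng.expovariate(total)  # waiting time to the next casualty
        if t >= t_f:
            break
        u = rng.random() * total     # choose which side takes the casualty
        if u < rate_x1:
            m1 -= 1
        elif u < rate_x1 + rate_x2:
            m2 -= 1
        else:
            n -= 1
    return m1, m2, n

def expected_payoff(m1, m2, n, a1, a2, b1, b2, t_f, policy, p, q, r,
                    runs=2000, seed=0):
    """Sample mean of r*N(t_f) - p*M1(t_f) - q*M2(t_f)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(runs):
        k1, k2, k = simulate_once(m1, m2, n, a1, a2, b1, b2, t_f, policy, rng)
        total += r * k - p * k1 - q * k2
    return total / runs

def x1_first(t, m1, m2, n):
    """Sample closed-loop policy: fire on X1 while any X1 survive."""
    return 1.0 if m1 > 0 else 0.0

if __name__ == "__main__":
    print(expected_payoff(3, 3, 6, 0.2, 0.2, 0.1, 0.1, 5.0, x1_first,
                          p=1.0, q=1.0, r=1.0))
```

Such a simulation only estimates the expected payoff of a given policy; it does not by itself produce the optimal closed-loop control.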

Table I.  Definition of Terminal States for Deterministic Optimal Control
          Problem (Prescribed Duration Battle).

$S_1$:  $x_1(t_f) > 0$,  $x_2(t_f) > 0$,  $y(t_f) > 0$,  $t_f = t_{max}$

$S_2$:  $x_1(t_f) = x_1(t_1) = 0$,  $x_2(t_f) > 0$,  $y(t_f) > 0$,  $t_f = t_{max}$,
        where $t_1 < t_f$

$S_3$:  $x_1(t_f) = x_1(t_3) > 0$,  $x_2(t_f) = 0$,  $y(t_f) > 0$,  $t_f = t_{max}$,
        where $t_3 < t_f$

$S_4$:  $x_1(t_f) > 0$,  $x_2(t_f) > 0$,  $y(t_f) = 0$,  $t_f \le t_{max}$

$S_5$:  $x_1(t_f) = x_1(t_1) = 0$,  $x_2(t_f) > 0$,  $y(t_f) = 0$,  $t_f \le t_{max}$,
        where $t_1 < t_f$

$S_6$:  $x_1(t_f) = x_1(t_2) > 0$,  $x_2(t_f) = 0$,  $y(t_f) = 0$,  $t_f \le t_{max}$,
        where $t_2 < t_f$

$S_7$:  $x_1(t_f) = x_1(t_1) = 0$,  $x_2(t_f) = 0$,  $y(t_f) > 0$,  $t_f \le t_{max}$,
        where $t_1 < t_f$

$S_8$:  $x_1(t_f) = 0$,  $x_2(t_f) = x_2(t_4) = 0$,  $y(t_f) > 0$,  $t_f \le t_{max}$,
        where $t_4 < t_f$

4.  Determination of an Optimal Policy for Deterministic Problem.

In this section we outline how an optimal policy (expressed as a
closed-loop control) may be determined for (7).  In order to keep the
length of the paper at hand within reasonable limits, we will only be able
to highlight the main points: details which are available elsewhere in the
open literature will be omitted, and the entire "solution" will not be
given here.

4.1.   Outline  of  Solution  Procedure. 
Before giving our solution algorithm, it seems appropriate to define
some terms.

Definition 1:   By an extremal path we mean a path on which the necessary
conditions of optimality are everywhere satisfied (we use
the word everywhere, since we take the class of admissible
controls to be the space of piecewise-continuous functions).

Definition  2:   By  an  extremal  control  we  mean  the  control  used  in  order 
that  the  system  follow  an  extremal  path. 

Definition  3:   By  the  domain  of  controllability  for  extremals  to  a  given 
terminal  state  we  mean  that  subset  of  the  initial  state 
space  from  which  extremals  lead  to  the  terminal  state. 

Definition  4:   By  the  synthesis  of  an  extremal  control  we  mean  using  the 
basic  necessary  conditions  of  optimality  to  explicitly 
determine  the  time  history  of  an  extremal  control  from 
initial  to  terminal  time  as  a  function  of  initial  conditions. 


Complete  results  in  a  form  suitable  for  numerical  determination  are  to  be 
found  in  Appendix  G  of  [43].   The  "solution"  occupies  twenty  pages  in  [43], 
and  this  should  explain  why  for  the  purposes  of  the  paper  at  hand  only 
representative  results  are  given. 

Our solution algorithm then is as follows:

(a)  an extremal control law is developed from the maximum principle
     (which must be modified when the trajectory lies on the boundary of
     the state space); for Lanchester "square-law" attrition structures
     the extremal control law in many cases depends only on relationships
     between dual variables (marginal returns from destroying targets),

(b)  for each terminal state an extremal control is synthesized by
     combining a backwards integration of the adjoint system of
     differential equations with the extremal control law and corner
     conditions,

(c)  for each terminal state the domain of controllability for extremals
     is determined by forwards integration of the state equations using
     the synthesized extremal control from (b),

(d)  the solution is determined at this point for regions of the initial
     state space which are covered by only (part of) the domain of
     controllability for extremals to one terminal state; one must also
     verify that the entire initial state space has been accounted for,
     since otherwise one may have overlooked some type of "singular"
     surface,

(e)  if domains of controllability overlap so that for a point of the
     initial state space contained in their intersection there is more
     than one extremal leading to the terminal surface, then one computes
     the return (or payoff) associated with each extremal; the optimal
     trajectory is selected from the extremals by comparing these values.

The above solution algorithm is a refinement of the one presented in [34].
Let us make a few remarks about the application of this procedure to the
prescribed duration battle (7).  For this problem we may think of time as
being an additional state variable.  On the other hand, for the
Isbell-Marlow terminal control problem [34] time may be considered as being
a parameter and consequently was eliminated for the determinations of step
(c) above.  In other words, for the Isbell-Marlow problem a domain of
controllability was determined by inequalities involving the three state
variables; for the prescribed duration battle (7) such a determination
involves the four variables $t_{max}$, $x_1^0$, $x_2^0$, and $y_0$.

Footnote: For this approach to work it is essential that an optimal policy
exist for (7).  This has previously been established in [37], [41].  In
this case one of the extremals must be an optimal trajectory.

For the prescribed duration battle we have not been able in all cases to
develop analytic expressions at step (c) in the above algorithm as we did
for the terminal control problem studied in [34].  Consequently, we could
not analytically accomplish steps (d) and (e) for the problem at hand.  We
have, however, used computational methods to determine the optimal control.
We have expressed our "solution" (partially presented in the next section)
so that given a point $P = (x_1^0, x_2^0, y_0)$ in the initial state space
and $t_{max}$, one can determine which terminal states are reached by
extremals.  Thus, we can determine to which domains of controllability P
belongs.  Then, using the extremal control, we can numerically compute the
return (or payoff) associated with each extremal and select the optimal
policy from among a finite number of possibilities.  A computer program was
written in FORTRAN to do the above, and computations were performed on an
IBM 360 computer.

4.2.   Summary  of  Solution. 

We have applied the solution procedure of Section 4.1 to develop a
"solution" in the sense discussed there.  Without loss of generality we
assume that $a_1 b_1 > a_2 b_2$, i.e. $R > 1$.  There are two cases to be
considered:

(1)  $\delta \ge 1$,
and  (2)  $0 \le \delta < 1$,
where $\delta = a_1 p/(a_2 q)$.

For Case (1): $\delta \ge 1$, the domains of controllability do not overlap
each other, and hence extremals are unique.  The extremal control is thus
the optimal control.  The optimal policy, moreover, may be expressed in a
particularly simple form: always concentrate all fire on $X_1$ while
$x_1 > 0$.  Further details on domains of controllability and "event" times
are to be found in Table II of [43].

For Case (2): $0 \le \delta < 1$, some domains of controllability overlap
each other, and hence extremals are not unique (in the sense that from a
point in the initial state space the system may be steered along any one
of several extremals to various end states of battle).  (See [41] for a
discussion of a similar case.)  Thus, considerations "in the large" (i.e.
step (e) of the above solution procedure) are required to determine the
optimal policy.  Unfortunately, explicit analytic expressions are not
readily obtainable as they were for the Isbell-Marlow terminal control
problem [34].  However, as discussed in Section 4.1 above, one can use the
information presented in Table III of [43] (which is fifteen pages long)
to numerically determine an optimal fire distribution policy for any
specific set of model input parameter values.  A representative sample of
this information is given in Table II.

In Case (2) the optimal fire distribution policy cannot be expressed in
the very simple form of the first case.  When Y wins in time less than
$t_{max}$ ($S_7$, for which the optimal policy is determined), the optimal
fire distribution policy is precisely the same as when $\delta \ge 1$.
However, for all other cases (i.e. terminal states $S_1$ through $S_6$)
the extremal policy is to finish the prescribed duration battle by firing
at $X_2$, regardless of whether or not $X_1$ has been annihilated.  This
differs from the policy when $\delta \ge 1$.  Thus, we see that force
levels affect the optimal fire distribution policy.

Table II.  A Representative Part of the Solution to the Prescribed
           Duration Battle for $0 \le \delta < 1$.
(Nonrestrictive assumption:  $R > 1$, i.e. $a_1 b_1 > a_2 b_2$)

$t_f = t_{max}$

Extremal Control:

$$ \phi_D^*(t) = \begin{cases} 1 & \text{for } 0 \le t \le t_1, \text{ where } x_1(t_1) = 0, \\ 0 & \text{for } t_1 < t \le t_f. \end{cases} $$


Domain  of  Controllability:   a  b  y2  >  g2  -  (b_x_)2 

albly0  *   "*  +  (R-1)<b2x2)2 

1    -li^iVo-^^V?^ 

t,  -  t.  +  — tanh  \ >  £  t 

f    1    I — r—       \  ,      o  c-  max 

•a2b2  b-x./R 

where  t  -  t  (S  )  -  t  (S  )   is  given  by 

(1)      for     a^y2   >   a2 


o 
x. 


i_  tsvs  -  *' +  ggl  -  v2 » 

/&T7  /IT7  y.  -  s 


t,  - 


11  11  'o 


(2)      for     a1b1y2   <   a2 


^ 1  _jnr„ f 


*?l  8  '  ^i  yo 

(3)      for     a^y2  -  s2 

t.    -  — —  *n{r^-o4 
1        v^bY       *Vl' 

NOTE:   for  0  £  6  <  r  -  /R(R-l)   optimal  paths  also  satisfy  (equality 
yielding  a  dispersal  surface) 
for  0  £  x°  <  (b2x°)/(kb1) 

u      2       „    2         R  JL     o[z2(R-l)  +  R]    .    .      ol2 
alVo  *  Rs     "  ?iV-  2P    ^  +  b2X2>   ' 

where  k   is  given  by  k  -  {z2  -  R(z-l) 2}/ (2R) . 

4.3.  Development of Basic Necessary Conditions of Optimality.

We will use Speyer and Bryson's approach [32] of adjoining the state
variable constraint directly to the criterion functional with a Lagrange
multiplier.  The Hamiltonian is given by (see also [19])

$$ H(t,x,p,\phi_D) = -p_1 \phi_D a_1 y - p_2 (1-\phi_D) a_2 y - p_3 (b_1 x_1 + b_2 x_2) + \eta_1(t) x_1 + \eta_2(t) x_2, \tag{9} $$

where

$$ \eta_i(t) \; \begin{cases} = 0 & \text{for } x_i > 0, \\ \ge 0 & \text{for } x_i = 0. \end{cases} $$

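The adjoint system of differential equations for the dual variables is

$$ \frac{dp_1}{dt} = -\frac{\partial H}{\partial x_1}(t,x^*,p,\phi_D^*) = b_1 p_3 - \eta_1(t), \tag{10} $$

$$ \frac{dp_2}{dt} = -\frac{\partial H}{\partial x_2}(t,x^*,p,\phi_D^*) = b_2 p_3 - \eta_2(t), \tag{11} $$

$$ \frac{dp_3}{dt} = -\frac{\partial H}{\partial y}(t,x^*,p,\phi_D^*) = \phi_D^* a_1 p_1 + (1-\phi_D^*) a_2 p_2. \tag{12} $$

Boundary conditions for the dual variables (also frequently called
transversality conditions) are discussed below.  When $t_f < t_{max}$, the
following transversality condition also holds

$$ H(t=t_f, x^*, p, \phi_D^*) = 0. \tag{13} $$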

When $x_1, x_2 > 0$, the maximum principle yields the extremal control
law [34], [41]

$$ \phi_D^*(t) = \begin{cases} 1 & \text{for } v(t) > 0, \\ 0 & \text{for } v(t) < 0, \end{cases} \tag{14} $$

where $v(t) = (-p_1)a_1 - (-p_2)a_2$.  In [34] we showed that there are no
singular subarcs (see Chapter 8 in [8]) in the solution.

Footnote: Taylor apparently is the only person to apply these important
results to variational problems in operations research.  See [41] for
discussion of previous applications.

Without loss of generality, let us consider a constrained subarc on which
$x_1(t) = 0$ for $t_1 \le t \le t_f$ (and $x_2, y > 0$ for $t < t_f$).
Since $dx_1/dt = 0$, the control is clearly $\phi_D^*(t) = 0$ for
$t_1 \le t \le t_f$.  The requirement that $\partial H/\partial \phi_D = 0$
yields the following relationship between dual variables on the constrained
subarc

$$ a_1 p_1(t) = a_2 p_2(t). \tag{15} $$

The multiplier $\eta_1(t)$ is determined from the condition that
$\frac{d}{dt}\left(\frac{\partial H}{\partial \phi_D}\right) = 0$, and this
yields

$$ \eta_1(t) = \frac{p_3(t)}{a_1}\,(a_1 b_1 - a_2 b_2). \tag{16} $$

The interpretation of $\eta_1(t)$ (see [41] for a further discussion) is
the rate of marginal return to Y for keeping $x_1 = 0$.  Thus,
(intuitively) Y tries to annihilate $X_1$ only when it profits him to do
so.  Furthermore, the requirement that $\eta_1(t) \ge 0$ when $x_1 = 0$ for
a finite interval of time yields that we must have

$$ a_1 b_1 \ge a_2 b_2, \tag{17} $$

since it may be shown that $p_3(t) > 0$ for $t < t_f$.  The nonrestrictive
assumption that $a_1 b_1 > a_2 b_2$ (i.e. $R > 1$) implies that it is
nonoptimal to have $x_2 = 0$ for a finite interval of time.
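The step from (15) to (16) can be made explicit. The sketch below assumes that the adjoint equations for the first two dual variables take the form dp_i/dt = b_i p_3 - eta_i(t), where eta_i(t) denotes the multiplier attached to the constraint x_i >= 0, and that eta_2(t) = 0 on the subarc because x_2 > 0 there. Differentiating (15) along the constrained subarc then gives

```latex
\begin{align*}
a_1\frac{dp_1}{dt} = a_2\frac{dp_2}{dt}
  \;&\Longrightarrow\;
  a_1\bigl(b_1 p_3 - \eta_1(t)\bigr) = a_2 b_2\, p_3 \\
  \;&\Longrightarrow\;
  \eta_1(t) = \frac{p_3(t)}{a_1}\,\bigl(a_1 b_1 - a_2 b_2\bigr).
\end{align*}
```

With p_3(t) > 0, the sign of the multiplier is the sign of a_1 b_1 - a_2 b_2, which is how the inequality (17) arises.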

Furthermore, when the necessary conditions of optimality are expressed
in Speyer and Bryson's format [32] (see also [19]), the corner conditions
(see pp. 125-126 of [8]) take a particularly simple form for a first order
SVIC: the adjoint variables are continuous across all corners (both
interior to and on the boundary of the state space).  In other words

$$ p(t_c^-) = p(t_c^+), \tag{18} $$

where $t_c^-$ denotes the time just before the corner (i.e. a left-hand
limit).  We also have that

$$ H(t_c^-, x(t_c^-), p(t_c^-), \phi_D^*(t_c^-)) = H(t_c^+, x(t_c^+), p(t_c^+), \phi_D^*(t_c^+)). \tag{19} $$

On entry to a constrained subarc with $x_1(t) = 0$ for $t_1 \le t \le t_f$,
(19) yields

$$ a_1 p_1(t_1^-) = a_1 p_1(t_1^+) = a_2 p_2(t_1^+) = a_2 p_2(t_1^-). \tag{20} $$

Footnote: The development of (15) requires a slightly different argument in
a special case with $y(t_f) = 0$.  See [41] for a further discussion of
this point.

Let us finally consider the boundary conditions for the dual variables at
$t = t_f$.  The nonrestrictive assumption that $a_1 b_1 > a_2 b_2$ yields
that no extremals lead to $S_8$.  The three terminal states $S_1$, $S_2$,
and $S_3$ may be discussed collectively.  In all three cases the length of
the battle is equal to $t_{max}$.  Then, according to the results presented
in [42], we have

for $S_1$, $S_2$, and $S_3$:

$$ p_1(t_f) = -p + \nu_1, \qquad p_2(t_f) = -q + \nu_2, \qquad p_3(t_f) = r > 0, \tag{21} $$

where

$$\nu_i \begin{cases} = 0 & \text{for } x_i(t_f) > 0, \\ \ge 0 & \text{for } x_i(t_f) = 0 \text{ but } x_i(t) > 0 \text{ for } t < t_f, \\ \text{unrestricted} & \text{for } x_i(t_f) = 0 \text{ and } x_i(t) = 0 \text{ for } t_i \le t \le t_f \text{ with } t_i < t_f. \end{cases} \tag{22}$$

The latter condition that, for example, the multiplier $\nu_1$ is unrestricted
when the system is on a constrained subarc for a finite interval of time
arises because the boundary of the state space is "absorbing" (i.e. the state
constraint $x_1 \ge 0$ essentially acts like a terminal equality constraint
as far as the determination of boundary conditions for the adjoint variables
is concerned [42]). If there were replacements in the model (7) so that the boundary
of the state space would not be "absorbing," then we would have $\nu_i \ge 0$
for $x_i(t_f) = 0$.

For $S_4$, $S_5$, and $S_6$ the duration of the battle $t_f$ is determined
by the terminal equality constraint $y(t_f) = 0$ when $t_f < t_{\max}$ so that
the transversality condition (13) yields $p_3(t_f) = 0$. When $t_f = t_{\max}$,
additional analysis is required, and this is discussed in Section 4.4
below. Then, again according to the results presented in [42], we have

for $S_4$, $S_5$, and $S_6$:

$$p_1(t_f) = -p + \nu_1, \qquad p_2(t_f) = -q + \nu_2, \qquad p_3(t_f) = 0, \tag{23}$$

where the multipliers $\nu_i$ for $i = 1,2$ are again given by (22).

For $S_7$: $x_1(t_f) = x_2(t_f) = 0$, $y(t_f) > 0$, $t_f \le t_{\max}$, we have [8]

$$p_1(t_f) = -p + \nu_1, \qquad p_2(t_f) = -q + \nu_2, \qquad p_3(t_f) = r > 0, \tag{24}$$

since $t_f$ is determined by the (equality) terminal constraints $x_1(t_f) = 0$
and $x_2(t_f) = 0$. Since these are equality constraints, the multipliers
$\nu_1$ and $\nu_2$ are unrestricted in sign. Since $t_f$ is unspecified, the
transversality condition (13) with $\phi^*(t_f) = 0$ yields that $-p_2(t_f) a_2 y = 0$
so that $p_2(t_f) = 0$ and $\nu_2 = q$. The condition (15) which, in particular,
holds at $t = t_f$ yields that $p_1(t_f) = 0$. Thus, we have

for $S_7$ [$x_1(t_f) = 0$ before $x_2(t_f) = 0$, $y(t_f) > 0$]:

$$p_1(t_f) = 0, \qquad p_2(t_f) = 0, \qquad p_3(t_f) = r. \tag{25}$$

4.4.   Synthesis  of  Extremal  Control. 

For  each  terminal  state,  extremals  may  be  synthesized  by  combining 
the  conditions  which  must  hold  on  a  constrained  subarc  and  the  extremal 
control  law  (14)  with  a  backwards  integration  of  the  adjoint  equations  (10) , 
(11)  and  (12) .   The  boundary  conditions  for  the  adjoint  variables  given 
in  Section  4.3  and  the  corner  conditions  (18)  and  (19)  are  used  in  this 
backwards sweep process. It is convenient to use the switching function
$v(t) = (-p_1)a_1 - (-p_2)a_2$ in synthesizing extremals. Using (10) and (11),
we readily find that for $t < t_f$

$$\frac{dv}{dt} = p_3(t)(-a_1 b_1 + a_2 b_2) < 0, \tag{26}$$

since $p_3(t) > 0$ for $t < t_f$.

Details in the synthesis of extremals are similar to those presented
in [34]-[38], [41], and [43], and hence they are omitted. The treatment
in [37] is most similar to the problem at hand. Details for $\delta \ge 1$ and
for $0 \le \delta < 1$ are different.

[Footnote: In some of these references the non-negativity of the force levels (i.e.
SVIC's) has been treated by means other than Speyer and Bryson's approach
[8]. The basic principles of working backwards from the end, however, are
the same in all applications.]

There are two interesting aspects, moreover, that we encountered
in synthesizing extremals. These are

(a) when $0 \le \delta < 1$ and a switch in the target type upon which all Y-fire
is concentrated occurs without the annihilation of a target type,
the switching time depends upon the initial force levels and possibly
the valuation of Y survivors, and

(b) when $P = (x_1^0, x_2^0, y^0)$ is such that when $\delta < 1$ an extremal leads
to $S_4^S$ (i.e. we reach $S_4$ with a switch in tactics) with
$t_f(S_4^S) < t_{\max}$, we can possibly also steer the system to an end point with
$y(t_f = t_{\max}) = 0$ without violating any necessary conditions of optimality.

Let us first discuss the dependence of the non-annihilation switching
time on force levels and valuation of Y survivors. Such a switch in
fire distribution only happens for $\delta < 1$. Let us compare the situations
for extremals leading to $S_1$ and $S_4$. In both cases we have

$$\phi_D^*(t) = \begin{cases} 1 & \text{for } 0 \le t \le t_f - \tau_1, \\ 0 & \text{for } t_f - \tau_1 < t \le t_f, \end{cases} \tag{27}$$

where $x_1(t = t_f - \tau_1) > 0$. It is convenient to introduce the "backwards time"
$\tau$ defined by $\tau = t_f - t$. Then when $\delta < 1$, we have $\phi_D^*(\tau) = 0$ for
$0 \le \tau \le \tau_1$, where $\tau_1$ denotes the backwards time of the first switch in
fire distribution. For $S_4$ [$x_1(t_f) > 0$, $x_2(t_f) > 0$, $y(t_f) = 0$, $t_f < t_{\max}$],
it may be shown using (10)-(12), (14), (23), and (26) that

$$\tau_1(S_4) = \frac{1}{\sqrt{a_2 b_2}} \cosh^{-1} z, \tag{28}$$

where $z = (R-\delta)/(R-1)$. For $S_1$ [$x_1(t_f) > 0$, $x_2(t_f) > 0$, $y(t_f) > 0$,
$t_f = t_{\max}$], it may be shown that

[Equations (29) and (30), which give $\tau_1(S_1)$, are illegible in the source; only fragments involving $\sqrt{a_2 b_2}$, $q$, and $a_2$ survive.]

[Footnote: Further details of the results summarized in this section are to be found
in [43]. To keep the paper at hand from being too long, we have omitted
them.]


The following theorem is of interest (see [36] for a similar result).

THEOREM 1: Assume that $R > 1$ and $\delta < 1$. Then,

$$\tau_1(S_1) < \tau_1(S_4).$$

A proof of Theorem 1 is given in [43]. Furthermore, it is readily shown
that $\lim_{r \to +\infty} \tau_1(S_1) = 0$. Thus, when $\delta < 1$, the switching time
$t = t_f - \tau_1(S_1)$ along extremals leading to $S_1$ explicitly depends on the value
Y places upon the survival of his own forces. The higher he values Y-force
survivors, the longer Y forces concentrate their fire on $X_1$ when
$\delta < 1$. For extremals leading to $S_4$, the transversality condition (13)
yields that Y-force survivors have zero value. Intuitively, we see that
firing longer at $X_1$ prolongs the length of battle for those cases when
$y(t_f) = 0$, since $a_1 b_1 > a_2 b_2$. However, for extremals leading to $S_4$
this is not an optimal tactic.
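The switching time (28) can be checked numerically. The sketch below is illustrative only: it assumes the definitions $R = a_1 b_1/(a_2 b_2)$ and $\delta = a_1 p/(a_2 q)$, and the adjoint equations $dp_1/dt = b_1 p_3$, $dp_2/dt = b_2 p_3$, $dp_3/dt = a_2 p_2$ on the final arc where all fire is on $X_2$. These equations are not shown in this section and are inferred here from (26) and from the form of $z$. The function names are hypothetical.

```python
import math


def adjoints_on_final_arc(tau, a2, b1, b2, p, q):
    """Adjoints at backward time tau on the final arc (all fire on X2).

    Assumed terminal values for S4: p1 = -p, p2 = -q, p3 = 0 (cf. (23)
    with nu_1 = nu_2 = 0). The closed forms below solve the assumed
    adjoint equations in backward time.
    """
    w = math.sqrt(a2 * b2)
    p2 = -q * math.cosh(w * tau)
    p3 = q * math.sqrt(a2 / b2) * math.sinh(w * tau)
    p1 = -p - (b1 * q / b2) * (math.cosh(w * tau) - 1.0)
    return p1, p2, p3


def switching_function(tau, a1, a2, b1, b2, p, q):
    """v = (-p1) a1 - (-p2) a2, as defined just before (26)."""
    p1, p2, _ = adjoints_on_final_arc(tau, a2, b1, b2, p, q)
    return (-p1) * a1 - (-p2) * a2


def tau1_S4(a1, a2, b1, b2, p, q):
    """Backward time of the first switch, equation (28)."""
    R = (a1 * b1) / (a2 * b2)       # assumed definition of R
    delta = (a1 * p) / (a2 * q)     # assumed definition of delta
    z = (R - delta) / (R - 1.0)     # z > 1 whenever R > 1 > delta
    return math.acosh(z) / math.sqrt(a2 * b2)


if __name__ == "__main__":
    params = dict(a1=2.0, a2=1.0, b1=1.0, b2=1.0, p=1.0, q=3.0)
    t1 = tau1_S4(**params)
    print("tau_1(S4) =", t1)
    print("v(tau_1)  =", switching_function(t1, **params))
```

Under these assumptions the switching function is negative at the end of the battle (fire on $X_2$) and vanishes at $\tau_1(S_4)$, which is the behaviour the text describes for $\delta < 1$.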

Let us therefore consider the case when $t_f = t_{\max}$ for $S_4$. We
just discussed above the possibility when $R > 1 > \delta$ of prolonging the
length of battle along an extremal leading to $S_4$ by firing longer at
$X_1$. Using (27), it may be shown that

$$y(t_f) = y(t_f - \tau_1)\cosh\left(\sqrt{a_2 b_2}\,\tau_1\right) - \frac{b_1 x_1(t_f - \tau_1) + b_2 x_2(t_f - \tau_1)}{\sqrt{a_2 b_2}}\,\sinh\left(\sqrt{a_2 b_2}\,\tau_1\right), \tag{31}$$

where

[Equations (32) and (33) are illegible in the source.]

and where $\nu$ is the multiplier corresponding to the terminal constraint
$y(t_f) = 0$. Then, the following lemma may be established [43].

LEMMA 1: Consider an extremal leading to $S_4$ with $y(t_f)$
given by (31) and $t_f$ defined by $y(t_f) = 0$. Then

$$\frac{\partial t_f}{\partial \tau_1} < 0 \quad \text{if and only if} \quad a_1 b_1 y_0^2 < s^2.$$

In [43] it is shown that by increasing the implicit valuation of Y
survivors (i.e. $\nu$ in (33)) the length of battle may be extended until
$t_f = t_{\max}$. However, this is not an optimal policy. This situation, in
which a special case (here $t_f = t_{\max}$ for $S_4$) requires an inordinate
amount of analysis, unfortunately has arisen in all problems that we have
studied.

4.5.   Obtaining  an  Optimal  Policy. 

After  extremals  have  been  synthesized,  domains  of  controllability 
for  extremals  may  be  obtained  as  shown  in  [34]  .   It  then  remains  to  apply 
steps  (d)  and  (e)  of  the  solution  procedure  given  in  Section  4.1.   A 
computer  program  written  in  FORTRAN  has  been  developed  to  assist  in  the 
determination of an optimal policy. This computer program does the following:
for a given point in the initial state space, we determine to which
terminal states extremals lead. Then, the payoff corresponding to each
extremal is computed. The optimal path (and hence the optimal policy) is
readily obtained by determining which extremal yields the largest return
to Y.

In the above fashion, the optimal fire distribution policy may be
obtained  as  an  open-loop  control.   After  this  has  been  obtained,  it  is  a 
straightforward  matter  to  express  the  optimal  policy  as  a  closed-loop 
control.   In  doing  this,  it  is  convenient  to  cite  the  principle  of  optimality 
[1]  (a  special  case  of  Isaacs'  tenet  of  transition  [17]  (see  also  [2])), 
i.e.  every  subarc  of  an  optimal  trajectory  is  itself  an  optimal  trajectory. 

5.   Determination  of  an  Optimal  Policy  for  Stochastic  Problem. 

In  this  section  we  discuss  how  an  optimal  fire  distribution  policy 
(expressed  as  a  closed-loop  control)  may  be  determined  for  (8) .   Using 
the  formalism  of  dynamic  programming,  we  develop  the  fundamental  functional 
equation  for  the  optimal  expected  value  function.   This  is  a  sufficient 
condition  of  optimality:   a  control  which  leads  to  the  satisfying  of  this 
equation  is  an  optimal  policy  (see  [29]).   An  analytic  solution  is  developed 
to  the  fundamental  functional  equation  for  very  small  numbers  of  combatants. 
Finite  difference  methods  are  used,  however,  to  generate  a  numerical 
approximate  solution,  since  a  general  solution  (for  arbitrary  numbers  of 
combatants)  has  not  been  obtained  to  the  fundamental  functional  equation. 

5.1   Development  of  Fundamental  Functional  Equation. 

Let $S(\tau, m_1, m_2, n)$ denote the optimal expected-value function (see
[12]). Then

$$S(\tau, m_1, m_2, n) = \underset{\phi_S \in \Phi}{\text{maximum}}\; E_{\mathbf{m},\tau}\left[ r N(\tau = 0) - p M_1(\tau = 0) - q M_2(\tau = 0) \right], \tag{34}$$

where

the system state is $m_1, m_2, n$ at time $\tau$ (i.e. $M_1(\tau) = m_1$, etc.),

$\Phi$ is the class of admissible controls (i.e. $\phi_S$ must always be
chosen from the set of rational numbers $\{0, \frac{1}{n(\tau)}, \frac{2}{n(\tau)}, \ldots, 1\}$),

$\tau = t_f - t$ is the "backwards time" from the end of battle (which
begins at $t = 0$),

$E_{\mathbf{m},\tau}$ denotes mathematical expectation given that
$\mathbf{m}(\tau) = (m_1(\tau), m_2(\tau), n(\tau))$,

casualties occur in a random fashion between $t$ and $t_f$.

In other words, $S(\tau, m_1, m_2, n)$ is the maximum return that we get on the
average when we start with force levels $m_1$, $m_2$, and $n$ at $t = t_f - \tau$,
follow an optimal policy $\phi_S^*(s, m_1, m_2, n)$ (chosen from the class of
admissible policies $\Phi$) for $t \le s \le t_f$, and casualties occur in a
random fashion.

We  consider  that  casualties  occur  as  a  Markov  process  with  discrete 

state  space  (or  discontinuous  Markov  process).   Specifically,  we  assume 

that 

(1)   the  attrition  process  is  a  continuous  parameter  Markov  chain  with 

stationary  transition  probabilities  corresponding  to  a  deterministic 
Lanchester  square-law  attrition  process;  this  is  equivalent  to 
assuming 

(a)  the  future  occurrences  of  casualties  depend  only  on  the  state 
of  the  system  at   t   and  not  on  past  history, 

(b)  the  transition  probabilities  depend  on  only  the  state  of  the 
system, 

(c) Prob[one $X_1$ casualty in interval $\Delta t$] $= \phi_S a_1 n \Delta t$,

Prob[one $X_2$ casualty in interval $\Delta t$] $= (1 - \phi_S) a_2 n \Delta t$,

Prob[one Y casualty in interval $\Delta t$] $= (b_1 m_1 + b_2 m_2) \Delta t$,

where $\phi_S a_1 n$ is the $X_1$ casualty rate, etc.,

(d) Prob[more than one casualty in interval $\Delta t$] $= O((\Delta t)^2)$,

where $O(x)$ denotes dependence on $x$ such that $\lim_{x \to 0} \frac{O(x)}{x} = \text{const.}$,

(2)  the  Y-forces  have  perfect  information  as  to  the  state  of  the  system 
at   t  and  the  expected  casualty  rates, 

(3)  the  Y-forces  can  instantaneously  shift  fire  from  any  target  at  any 
time, 

(4)  the  length  of  the  battle  is  known. 
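Assumptions (1) through (4) describe a continuous-parameter Markov chain that can be sampled event by event. The following is a minimal simulation sketch, not the authors' FORTRAN program: the function name `simulate_battle`, the `policy` callback signature, and the rule that fire is forced onto the surviving target type when the other is annihilated (cf. (41) and (42)) are choices made here for illustration.

```python
import random


def simulate_battle(m1, m2, n, a1, a2, b1, b2, horizon, policy, rng):
    """Sample one realization of the attrition process up to time `horizon`.

    `policy(tau, m1, m2, n)` returns the fraction phi of Y fire placed on X1,
    where tau is the backwards time remaining. Event rates follow assumption
    (1c): phi*a1*n for an X1 casualty, (1-phi)*a2*n for an X2 casualty, and
    b1*m1 + b2*m2 for a Y casualty.
    """
    t = 0.0
    while n > 0 and (m1 + m2) > 0:
        phi = policy(horizon - t, m1, m2, n)
        if m1 == 0:          # no X1 targets left: all fire goes to X2
            phi = 0.0
        if m2 == 0:          # no X2 targets left: all fire goes to X1
            phi = 1.0
        rate_x1 = phi * a1 * n
        rate_x2 = (1.0 - phi) * a2 * n
        rate_y = b1 * m1 + b2 * m2
        total = rate_x1 + rate_x2 + rate_y
        t += rng.expovariate(total)      # exponential waiting time to next casualty
        if t >= horizon:                 # battle length is known and fixed
            break
        u = rng.random() * total         # pick which casualty occurred
        if u < rate_x1:
            m1 -= 1
        elif u < rate_x1 + rate_x2:
            m2 -= 1
        else:
            n -= 1
    return m1, m2, n


if __name__ == "__main__":
    rng = random.Random(0)
    always_x1 = lambda tau, m1, m2, n: 1.0
    print(simulate_battle(5, 4, 6, 0.3, 0.2, 0.25, 0.15, 2.0, always_x1, rng))
```

Because casualties occur one at a time, each force level can only decrease by one per event, which matches assumption (1d) in the limit of small intervals.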
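Then, we have

state variables: $M_1(t)$, $M_2(t)$, $N(t)$,
decision (or control) variable: $\phi_S$,

where

$$\phi_S \in \left\{ 0, \frac{1}{n(t)}, \frac{2}{n(t)}, \ldots, \frac{n(t)-1}{n(t)}, 1 \right\}.$$

To be more precise, $\phi_S = \phi_S(t, m_1, m_2, n)$ is a closed-loop (or feedback)
control.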

To develop the fundamental functional equation for the optimal
expected-value function, we begin by considering any interval of "backwards
time" of length $\Delta\tau$ which occurs from $\tau - \Delta\tau$ to $\tau$. There are five
exhaustive and mutually exclusive possibilities for random events to occur
in such an interval. These are

(1) one $X_1$ casualty occurs,

(2) one $X_2$ casualty occurs,

(3) one Y casualty occurs,

(4) no casualty occurs,

(5) more than one casualty occurs.

Let us now examine each of these cases and develop expected returns.

(1) One $X_1$ casualty occurs in $\Delta\tau$:

By our assumptions above, we have for the probability of occurrence
of this event

Prob[one $X_1$ casualty occurs in $\Delta\tau$] $= \phi_S a_1 n \Delta\tau$.

Given that one $X_1$ casualty is realized in the interval from $\tau$ to
$\tau - \Delta\tau$, the optimal fire distribution policy for Y will consider the
maximum expected value for the return functional as casualties continue
to occur randomly from $\tau - \Delta\tau$ to $\tau = 0$. This maximum expected value
is $S(\tau - \Delta\tau, m_1(\tau - \Delta\tau), m_2(\tau - \Delta\tau), n(\tau - \Delta\tau))$ where
$m_1(\tau - \Delta\tau) = m_1(\tau) - 1$, $m_2(\tau - \Delta\tau) = m_2(\tau)$, and $n(\tau - \Delta\tau) = n(\tau)$.

(2) One $X_2$ casualty occurs in $\Delta\tau$:

Similarly, we have that

Prob[one $X_2$ casualty occurs in $\Delta\tau$] $= (1 - \phi_S) a_2 n \Delta\tau$,

with the optimal expected-value function $S(\tau - \Delta\tau, m_1(\tau), m_2(\tau) - 1, n(\tau))$.
Events (3) through (5) are analyzed in a similar fashion.

Now, by the standard dynamic programming argument which combines
the probabilities of events (1) through (5) above with the maximum expected
return achievable given that these events occur, we obtain the expression

$$\begin{aligned} S(\tau, m_1, m_2, n) = \underset{0 \le \phi_S \le 1}{\text{maximum}} \Big\{ & \left[ 1 - \Delta\tau \{ \phi_S a_1 n + (1 - \phi_S) a_2 n + b_1 m_1 + b_2 m_2 \} \right] S(\tau - \Delta\tau, m_1, m_2, n) \\ & + \phi_S a_1 n \Delta\tau\, S(\tau - \Delta\tau, m_1 - 1, m_2, n) + (1 - \phi_S) a_2 n \Delta\tau\, S(\tau - \Delta\tau, m_1, m_2 - 1, n) \\ & + (b_1 m_1 + b_2 m_2) \Delta\tau\, S(\tau - \Delta\tau, m_1, m_2, n - 1) \Big\}. \end{aligned} \tag{35}$$

Rearranging terms in (35) and taking the limit as $\Delta\tau \to 0$, we
obtain the fundamental functional equation for the optimal expected-value
function

for $m_1, m_2, n > 0$:

$$\begin{aligned} \frac{\partial S}{\partial \tau}(\tau, m_1, m_2, n) = {} & (b_1 m_1 + b_2 m_2) \{ S(\tau, m_1, m_2, n - 1) - S(\tau, m_1, m_2, n) \} \\ & + n\, \underset{0 \le \phi_S \le 1}{\text{maximum}} \big[ \phi_S a_1 \{ S(\tau, m_1 - 1, m_2, n) - S(\tau, m_1, m_2, n) \} \\ & \qquad + (1 - \phi_S) a_2 \{ S(\tau, m_1, m_2 - 1, n) - S(\tau, m_1, m_2, n) \} \big], \end{aligned} \tag{36}$$

with the boundary condition at $t = t_f$

$$S(\tau = 0, m_1, m_2, n) = r n - p m_1 - q m_2, \tag{37}$$

where $m_1$, $m_2$, and $n$ are integers and

$$\Phi = \left\{ 0, \frac{1}{n}, \frac{2}{n}, \ldots, \frac{n-1}{n}, 1 \right\}. \tag{38}$$

Special forms of (36) in which $m_1 = 0$, etc., will be given later.

More  concisely,  we  could  have  said  that  (36)  results  from  combina- 
tion of  the  well-known  formalism  of  dynamic  programming  with  the  retrospective 
(backward)  probabilistic  evolution  of  the  system  over  time  (c.f.  [13],  [22]). 
It  should  be  noted  that  (36)  is  a  special  case  of  an  equation  given  by 
Kushner  in  1962  [21]. 
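The rearrangement that takes (35) into (36) can be written out in one intermediate step. The display below is a sketch under the assumption that $S$ is differentiable in $\tau$; it uses only quantities already defined in (35).

```latex
% Requires amsmath. Subtract S(\tau-\Delta\tau, m_1, m_2, n) from both sides
% of (35) and divide by \Delta\tau. The term that does not depend on \phi_S
% can be taken outside the maximum.
\begin{align*}
\frac{S(\tau,m_1,m_2,n) - S(\tau-\Delta\tau,m_1,m_2,n)}{\Delta\tau}
  &= (b_1 m_1 + b_2 m_2)\,\bigl\{ S(\tau-\Delta\tau,m_1,m_2,n-1)
        - S(\tau-\Delta\tau,m_1,m_2,n) \bigr\} \\
  &\quad + n \max_{0 \le \phi_S \le 1} \Bigl[
        \phi_S a_1 \bigl\{ S(\tau-\Delta\tau,m_1-1,m_2,n)
        - S(\tau-\Delta\tau,m_1,m_2,n) \bigr\} \\
  &\qquad\qquad + (1-\phi_S) a_2 \bigl\{ S(\tau-\Delta\tau,m_1,m_2-1,n)
        - S(\tau-\Delta\tau,m_1,m_2,n) \bigr\} \Bigr].
\end{align*}
% Letting \Delta\tau \to 0, the left side tends to \partial S / \partial\tau
% and every S(\tau-\Delta\tau,\cdot) on the right tends to S(\tau,\cdot),
% which is (36).
```

Since the bracket is linear in $\phi_S$, its maximum over $0 \le \phi_S \le 1$ is attained at $\phi_S = 0$ or $\phi_S = 1$, both of which belong to the admissible set (38).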

If we take (36) to be the basic equation for $S(\tau, m_1, m_2, n)$, then
(35) may be considered to be the simplest finite difference approximation
to it, i.e. the result of applying the well-known Euler's method to (36)
(see pp. 130-131 of [15]). (Of course, a method employing a higher order
approximation scheme (see pp. 132-140 of [15]) may be necessary under many
circumstances.) We will find this point of view convenient when we consider
developing a solution to (36).

Alternatively,  we  could  have  taken  a  discrete  parameter  Markov 
chain  as  our  basic  combat  model.   It  is  readily  shown  that  an  optimal 
policy  exists  for  this  latter  formulation  (see  Theorem  1  on  pp.  88-89  of 
[22]),  and  that  a  policy  which  yields  the  maximum  in  (35)  is  an  optimal 
policy  (see  Theorem  2  on  p.  89  of  [22]). 

5.2.   On the Analytic Solution of the Fundamental Functional Equation.

The first task in determining an optimal fire distribution policy
(which requires obtaining the solution to (36) and (37)) is to develop the
entire system of equations (cf. equations (2) through (4)). We must,
therefore, develop the form that (36) takes at the boundary of the system,
i.e. $m_1 = 0$ or $m_2 = 0$ or $n = 0$, where the fire distribution problem
no longer exists. When $n = 0$, arguments similar to the above lead to

for $n = 0$, $m_1 \ge 0$, $m_2 \ge 0$:

$$\frac{\partial S}{\partial \tau}(\tau, m_1, m_2, 0) = 0 \quad \text{with} \quad S(\tau = 0, m_1, m_2, 0) = -m_1 p - m_2 q,$$

and hence

for $n = 0$, $m_1 \ge 0$, $m_2 \ge 0$:

$$S(\tau, m_1, m_2, 0) = -m_1 p - m_2 q. \tag{39}$$

Similarly,

for $m_1 = 0$, $m_2 = 0$, $n \ge 0$:

$$S(\tau, 0, 0, n) = n r, \tag{40}$$

for $m_1 = 0$, $m_2 > 0$, $n > 0$:

$$\frac{\partial S}{\partial \tau}(\tau, 0, m_2, n) = b_2 m_2 \{ S(\tau, 0, m_2, n - 1) - S(\tau, 0, m_2, n) \} + a_2 n \{ S(\tau, 0, m_2 - 1, n) - S(\tau, 0, m_2, n) \}, \tag{41}$$

for $m_1 > 0$, $m_2 = 0$, $n > 0$:

$$\frac{\partial S}{\partial \tau}(\tau, m_1, 0, n) = b_1 m_1 \{ S(\tau, m_1, 0, n - 1) - S(\tau, m_1, 0, n) \} + a_1 n \{ S(\tau, m_1 - 1, 0, n) - S(\tau, m_1, 0, n) \}. \tag{42}$$

Equations (36) through (42) are the complete system of equations for the
optimal expected-value function in the optimal control of the Lanchester
stochastic process.

For $m_1 > 0$, $m_2 > 0$, $n > 0$ the optimal fire distribution
policy is determined by the maximization operation in (36), and hence

$$\phi_S^*(\tau, m_1, m_2, n) = \begin{cases} 1 & \text{for } W(\tau, m_1, m_2, n) > 0, \\ 0 & \text{for } W(\tau, m_1, m_2, n) < 0, \end{cases} \tag{43}$$

where we shall refer to $W(\tau, m_1, m_2, n)$ as the "switching function." It is
defined by

for $m_1 > 0$, $m_2 > 0$, $n > 0$,

$$W(\tau, m_1, m_2, n) = a_1 \{ S(\tau, m_1 - 1, m_2, n) - S(\tau, m_1, m_2, n) \} - a_2 \{ S(\tau, m_1, m_2 - 1, n) - S(\tau, m_1, m_2, n) \}. \tag{44}$$

Let us observe that at the end of the battle at $t = t_f$, we may combine
(37), (43), and (44) to obtain

$$\phi_S^*(\tau = 0, m_1, m_2, n) = \begin{cases} 1 & \text{for } a_1 p > a_2 q, \\ 0 & \text{for } a_1 p < a_2 q, \end{cases} \tag{45}$$

which is similar to results for the optimal control of the deterministic
process (7) (see, for example, (14), (21), and (22)).

It should be noted that equations (36) through (42) have the same
form as those for the Lanchester square-law attrition stochastic process
(i.e. equations (2) through (4) when the attrition rates are given by (6)).
A general solution has not been obtained to these equations. Nevertheless,
it is of value to develop a partial solution. For example, since we use
finite difference methods to generate an approximate solution (see Section
5.3 below), it is desirable to check the adequacy of the approximation (in
particular, the "time step size" used in the numerical propagation of the
approximate solution by "marching ahead in time"). This is easily done by
comparing the approximate solution, denoted as $\hat{S}$, to the exact analytic
solution, denoted as $S$. Hence, a partial analytic solution is useful.

Careful consideration of (36) through (42) reveals that there are
restrictions on the order in which the optimal expected-value functions
$S(\tau, m_1, m_2, n)$ for $m_1 = 0, 1, 2, \ldots$, etc., can be computed. In particular
an admissible sequence for building up the solution through $S(\tau, 1, 1, 1)$
is shown below in Table III.


    m1   m2   n
     0    0   0
     1    0   0
     0    1   0
     0    0   1
     1    1   0
     0    1   1
     1    0   1
     1    1   1

Table III.  Admissible Order for Computing Optimal Expected-Value
Functions (admissible order is from top to bottom).


We note that (36) becomes a first order system of ordinary differential
equations for $S(\tau, m_1, m_2, n)$ when $\phi_S^*$ as determined by (43) is used. Solving
for $S(\tau, m_1, m_2, n)$ for $m_1 = 0, 1, 2, \ldots$, etc., we can then determine $\phi_S^*$ by
(43). The synthesis of an optimal control by combination of the control law
(43) with integration of a system of differential equations is similar to
that for deterministic optimal control problems.

Using (39) through (42), we readily compute in succession

$$S(\tau, 0, 0, 0) = 0, \qquad S(\tau, 1, 0, 0) = -p, \qquad S(\tau, 0, 1, 0) = -q,$$

$$S(\tau, 0, 0, 1) = r, \qquad S(\tau, 1, 1, 0) = -p - q,$$

$$S(\tau, 0, 1, 1) = \left[ \frac{b_2 r - a_2 q}{a_2 + b_2} \right] e^{-(a_2 + b_2)\tau} + \left[ \frac{a_2 r - b_2 q}{a_2 + b_2} \right],$$

$$S(\tau, 1, 0, 1) = \left[ \frac{b_1 r - a_1 p}{a_1 + b_1} \right] e^{-(a_1 + b_1)\tau} + \left[ \frac{a_1 r - b_1 p}{a_1 + b_1} \right]. \tag{46}$$
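As a worked instance of how the entries of (46) arise, the derivation of $S(\tau, 0, 1, 1)$ from (41) is sketched below. It uses only (37), (39), (40), and (41); the expression for $S(\tau, 1, 0, 1)$ follows in the same way from (42).

```latex
% Requires amsmath. Set m_2 = n = 1 in (41) and substitute
% S(\tau,0,1,0) = -q from (39) and S(\tau,0,0,1) = r from (40).
\begin{align*}
\frac{\partial S}{\partial\tau}(\tau,0,1,1)
  &= b_2\{-q - S(\tau,0,1,1)\} + a_2\{r - S(\tau,0,1,1)\} \\
  &= (a_2 r - b_2 q) - (a_2 + b_2)\,S(\tau,0,1,1),
  \qquad S(0,0,1,1) = r - q \quad\text{by (37)}. \\
\intertext{This linear equation has the constant solution
$(a_2 r - b_2 q)/(a_2+b_2)$ plus a decaying exponential:}
S(\tau,0,1,1)
  &= \frac{a_2 r - b_2 q}{a_2+b_2}
   + \Bigl[(r - q) - \frac{a_2 r - b_2 q}{a_2+b_2}\Bigr] e^{-(a_2+b_2)\tau}
   = \frac{a_2 r - b_2 q}{a_2+b_2}
   + \frac{b_2 r - a_2 q}{a_2+b_2}\, e^{-(a_2+b_2)\tau}.
\end{align*}
```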


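Using (46), equations (36) and (37) become for $m_1 = 1$, $m_2 = 1$, $n = 1$,

$$\begin{aligned} \frac{\partial S}{\partial \tau}(\tau, 1, 1, 1) = {} & -(b_1 + b_2) \{ S(\tau, 1, 1, 1) + (p + q) \} \\ & + \underset{\phi_S = 0 \text{ or } 1}{\text{maximum}} \big[ \phi_S a_1 \{ S(\tau, 0, 1, 1) - S(\tau, 1, 1, 1) \} + (1 - \phi_S) a_2 \{ S(\tau, 1, 0, 1) - S(\tau, 1, 1, 1) \} \big], \end{aligned} \tag{47}$$

with

$$S(\tau = 0, 1, 1, 1) = r - p - q,$$

where $S(\tau, 0, 1, 1)$ and $S(\tau, 1, 0, 1)$ are given by (46).

Using (43), (44), and (45), we may readily solve (47). As for the
deterministic formulation, there are two cases that must be distinguished:

Case (1)     $a_1 p \ge a_2 q$,

Case (2)     $a_1 p < a_2 q$.

For Case (1): $a_1 p \ge a_2 q$, we have that $\phi_S^*(\tau, 1, 1, 1) = 1$ for $0 \le \tau \le \tau_1$,
where $\tau_1$ denotes the "backwards time" of the first switch in the optimal
fire distribution policy. Thus $\tau_1$ is the smallest $\tau$ which satisfies
$W(\tau = \tau_1, 1, 1, 1) = 0$ with $W(\tau, 1, 1, 1)$ given by (44).

for $0 \le \tau \le \tau_1$ when $a_1 p \ge a_2 q$ ($\phi_S^*(\tau, 1, 1, 1) = 1$):

$$\begin{aligned} S(\tau, 1, 1, 1) = {} & \frac{a_1 (b_2 r - a_2 q)}{(a_1 + b_1 - a_2)(a_2 + b_2)}\, e^{-(a_2 + b_2)\tau} \\ & + \left\{ \frac{[a_1 b_1 + (b_1 - a_2)(b_1 + b_2)]\, r}{(a_1 + b_1 - a_2)(a_1 + b_1 + b_2)} - \frac{a_1 p}{a_1 + b_1 + b_2} + \frac{a_1 a_2 q}{(a_1 + b_1 - a_2)(a_1 + b_1 + b_2)} \right\} e^{-(a_1 + b_1 + b_2)\tau} \\ & + \frac{a_1 a_2 r}{(a_2 + b_2)(a_1 + b_1 + b_2)} - \frac{(b_1 + b_2) p}{a_1 + b_1 + b_2} - \frac{[(b_1 + b_2)(a_2 + b_2) + a_1 b_2]\, q}{(a_2 + b_2)(a_1 + b_1 + b_2)}. \end{aligned} \tag{48}$$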


We note that $\tau_1$ might be equal to $+\infty$, i.e. we never switch. Assuming
that a switch in targets does occur, however, let us denote $S(\tau = \tau_1, 1, 1, 1)$
by $S_1$ where, as we recall, $\tau_1$ is the smallest $\tau$ which satisfies
$W(\tau = \tau_1, 1, 1, 1) = 0$. Then, we have that $\phi_S^*(\tau, 1, 1, 1) = 0$ for $\tau_1 < \tau \le \tau_2$,
where $\tau_2$ denotes the "backwards time" of the second switch in the optimal
fire distribution policy. Then, we have

for $\tau_1 < \tau \le \tau_2$ when $a_1 p \ge a_2 q$ ($\phi_S^*(\tau, 1, 1, 1) = 0$):

$$\begin{aligned} S(\tau, 1, 1, 1) = {} & \frac{a_2 (b_1 r - a_1 p)}{(a_2 + b_2 - a_1)(a_1 + b_1)} \left\{ e^{-(a_1 + b_1)\tau} - e^{(a_2 + b_2)(\tau_1 - \tau) - a_1 \tau_1 - b_1 \tau} \right\} \\ & + \left\{ S_1 - \left[ \frac{a_1 a_2 r}{(a_1 + b_1)(a_2 + b_2 + b_1)} - \frac{[(b_1 + b_2)(a_1 + b_1) + a_2 b_1]\, p}{(a_1 + b_1)(a_2 + b_2 + b_1)} - \frac{(b_1 + b_2) q}{a_2 + b_2 + b_1} \right] \right\} e^{(a_2 + b_2 + b_1)(\tau_1 - \tau)} \\ & + \left\{ \frac{a_1 a_2 r}{(a_1 + b_1)(a_2 + b_2 + b_1)} - \frac{[(b_1 + b_2)(a_1 + b_1) + a_2 b_1]\, p}{(a_1 + b_1)(a_2 + b_2 + b_1)} - \frac{(b_1 + b_2) q}{a_2 + b_2 + b_1} \right\}. \end{aligned} \tag{49}$$

Again, we note that $\tau_2$ might be equal to $+\infty$, i.e. we might never
redistribute fire a second time. Assuming that a second switch in fire
distribution does occur, we have $\phi_S^*(\tau, 1, 1, 1) = 1$ for $\tau_2 < \tau \le \tau_3$. We have not
carried out the computation of $S(\tau, 1, 1, 1)$ past $\tau_2$.

For Case (2): $a_1 p < a_2 q$, the results are symmetric to the above
(interchange the roles of $X_1$ and $X_2$) and hence are omitted.

Although the above constitutes a complete development for $S(\tau, 1, 1, 1)$
(and hence $\phi_S^*(\tau, 1, 1, 1)$ via $W(\tau, 1, 1, 1)$), these results are complex
enough that it is not immediately clear how $\phi_S^*(\tau, 1, 1, 1)$ changes over
time and/or depends on model parameters.

5.3.   Development  of  Numerical  Solution. 

With  the  advent  of  modern  high-speed  digital  computers,  finite 
difference  methods  of  obtaining  an  approximate  solution  are  commonly  used 
when  an  analytic  solution  cannot  be  obtained  to  equations  like  (36)  through 
(42).   Euler's  method  (see  pp.  130-131  of  [15])  yields  the  simplest  finite 
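difference approximation for (36). Let us denote the approximation to the
optimal expected value function as $\hat{S}$. We shall compute values for this
approximation at discrete points in time separated by a constant amount
$\Delta\tau$. We let $\tau = \ell \Delta\tau$ so that $t_f = L \Delta\tau$. Then (36) may be approximated
by

for $m_1 > 0$, $m_2 > 0$, $n > 0$:

$$\begin{aligned} \hat{S}((\ell + 1)\Delta\tau, m_1, m_2, n) = {} & \{ 1 - (\Delta\tau)(b_1 m_1 + b_2 m_2) \}\, \hat{S}(\ell\Delta\tau, m_1, m_2, n) + (\Delta\tau)(b_1 m_1 + b_2 m_2)\, \hat{S}(\ell\Delta\tau, m_1, m_2, n - 1) \\ & + n (\Delta\tau)\, \underset{0 \le \phi_S \le 1}{\text{maximum}} \big[ \phi_S a_1 \{ \hat{S}(\ell\Delta\tau, m_1 - 1, m_2, n) - \hat{S}(\ell\Delta\tau, m_1, m_2, n) \} \\ & \qquad + (1 - \phi_S) a_2 \{ \hat{S}(\ell\Delta\tau, m_1, m_2 - 1, n) - \hat{S}(\ell\Delta\tau, m_1, m_2, n) \} \big], \end{aligned} \tag{50}$$

for $\ell = 0, 1, \ldots, L - 1$ with boundary condition (37) and also (38). Similar
approximations may be developed for (41) and (42).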
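The recursion (50), together with the boundary forms (39) through (42), can be sketched in a few lines. This is an illustrative Python sketch, not the authors' FORTRAN program; the function names and the dictionary-based storage are choices made here. Because the bracket in (50) is linear in $\phi_S$, the maximum is taken over the two endpoint values only.

```python
import itertools
import math


def euler_value_function(M1, M2, N, a1, a2, b1, b2, p, q, r, T, L):
    """March the approximation of (50) from tau = 0 to tau = T in L Euler steps.

    Returns a dict mapping (m1, m2, n) to the approximate value at tau = T
    for all 0 <= m1 <= M1, 0 <= m2 <= M2, 0 <= n <= N.
    """
    dt = T / L
    # Per-step event probabilities must stay below one.
    if dt * (b1 * M1 + b2 * M2 + max(a1, a2) * N) >= 1.0:
        raise ValueError("time step too large")
    states = list(itertools.product(range(M1 + 1), range(M2 + 1), range(N + 1)))
    # Boundary condition (37) at tau = 0.
    S = {(m1, m2, n): r * n - p * m1 - q * m2 for (m1, m2, n) in states}
    for _ in range(L):
        new = {}
        for (m1, m2, n) in states:
            cur = S[(m1, m2, n)]
            if n == 0 or (m1 == 0 and m2 == 0):
                new[(m1, m2, n)] = cur      # (39), (40): value does not change
                continue
            y_term = (b1 * m1 + b2 * m2) * (S[(m1, m2, n - 1)] - cur)
            options = []                    # phi = 1 and phi = 0 branches
            if m1 > 0:
                options.append(a1 * (S[(m1 - 1, m2, n)] - cur))
            if m2 > 0:
                options.append(a2 * (S[(m1, m2 - 1, n)] - cur))
            new[(m1, m2, n)] = cur + dt * (y_term + n * max(options))
        S = new
    return S


def switching_function(S, m1, m2, n, a1, a2):
    """W of (44), evaluated from a table of values at one backward time."""
    cur = S[(m1, m2, n)]
    return a1 * (S[(m1 - 1, m2, n)] - cur) - a2 * (S[(m1, m2 - 1, n)] - cur)


def exact_S011(tau, a2, b2, q, r):
    """Closed form for S(tau, 0, 1, 1) from (46)."""
    k = a2 + b2
    return (b2 * r - a2 * q) / k * math.exp(-k * tau) + (a2 * r - b2 * q) / k


if __name__ == "__main__":
    S = euler_value_function(1, 1, 1, 1.0, 0.8, 0.6, 0.5, 1.0, 2.0, 3.0, 1.0, 20000)
    print("Euler S(1,0,1,1):", S[(0, 1, 1)])
    print("exact S(1,0,1,1):", exact_S011(1.0, 0.8, 0.5, 2.0, 3.0))
    print("W(1,1,1,1):      ", switching_function(S, 1, 1, 1, 1.0, 0.8))
```

Comparing the Euler value with the closed form from (46) is the kind of step-size check described in the text; the parameter values in the usage example are arbitrary.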


[Footnote: We recall that for the deterministic formulation when $x_1(t_f) > 0$ and
$x_2(t_f) > 0$, the conditions $a_1 p \ge a_2 q$ and $a_1 b_1 > a_2 b_2$ implied that
$\phi_D^*(t, x_1, x_2, y) = 1$ for the entire battle.]

As noted above, consideration of (36) through (42) yields that
there are restrictions on the order in which the optimal expected-value
function $S$ (or its approximation $\hat{S}$) is computed. The computation of
$\hat{S}((\ell + 1)\Delta\tau, m_1, m_2, n)$ depends upon the quantities shown in Figure 1 below.

[Figure 1: the value at $(m_1, m_2, n)$ depends on the values at $(m_1 - 1, m_2, n)$, $(m_1, m_2 - 1, n)$, $(m_1, m_2, n - 1)$, and $(m_1, m_2, n)$.]

Figure 1.   Dependence of Optimal Expected-Value Function
on Discrete State Variables.

Based on the dependence depicted in Figure 1, the solution can be "built-up"
as shown in Table IV.

It remains to discuss the adequacy of the finite difference approximation
(50). It is well-known (see pp. 130-145 in [15]) that Euler's
method yields a finite difference approximation for such a system of
differential equations that is both consistent and stable so that the
approximate solution $\hat{S}$ can be guaranteed to converge to the exact analytic
solution $S$ as $\Delta\tau \to 0$ (and $L \to \infty$) [28]. However, $\Delta\tau$ must not be too
large in order to keep the truncation error satisfactorily small. Moreover,
the time step size $\Delta\tau$ is also limited by the fact that quantities like
$(\Delta\tau)(b_1 m_1 + b_2 m_2)$ or $a_1 n \Delta\tau$ or $a_2 n \Delta\tau$ in (50) represent probabilities and
hence must be less than one. In our computational work we have used a
time step size which yields agreement in the fourth decimal place to the
right of the decimal point when $\hat{S}$ is compared to the exact analytic
solution $S$ in the special cases (such as (48) and (49)) when the latter has
been obtained.

[Footnote: A computer program has been written in FORTRAN for this purpose.]

Table IV.   Admissible Order for Computing Optimal
Expected-Value Functions.

    m1  m2  n   |   m1  m2  n   |   m1  m2  n   |   m1  m2  n
     0   0  0   |    0   2  2   |    1   3  0   |    3   2  2
     1   0  0   |    2   0  1   |    2   3  0   |    1   3  1
     0   1  0   |    1   0  2   |    3   3  0   |    2   3  1
     0   0  1   |    2   0  2   |    0   3  1   |    3   3  1
     1   1  0   |    2   1  1   |    0   3  2   |    1   3  2
     0   1  1   |    1   2  1   |    0   1  3   |    2   3  2
     1   0  1   |    1   1  2   |    0   2  3   |    1   1  3
     1   1  1   |    2   2  1   |    0   3  3   |    2   1  3
     2   0  0   |    1   2  2   |    3   0  1   |    3   1  3
     0   2  0   |    2   1  2   |    3   0  2   |    1   2  3
     0   0  2   |    2   2  2   |    1   0  3   |    1   3  3
     2   1  0   |    3   0  0   |    2   0  3   |    2   2  3
     1   2  0   |    0   3  0   |    3   0  3   |    3   3  2
     2   2  0   |    0   0  3   |    3   1  1   |    2   3  3
     0   2  1   |    3   1  0   |    3   2  1   |    3   2  3
     0   1  2   |    3   2  0   |    3   1  2   |    3   3  3
                                                     4   0  0
                                                     0   4  0
                                                     etc.

Note:   Admissible order is top to bottom, starting with the
column (composed of $m_1$, $m_2$, $n$) on the left.

6.   Comparison  of  Results  from  Deterministic  and  Stochastic  Formulations. 

In  this  section  we  compare  the  structures  of  the  optimal  fire  dis- 
tribution policy  between  the  deterministic  control  problem  (7)  and  the 
stochastic  control  problem  (8).   Before  presenting  this  comparison,  it 
seems  appropriate  to  discuss  some  general  methodological  considerations. 

Any comparison between the two models should be guided by the purpose
of the comparison. In the paper at hand our purpose is to consider whether
the structure of the optimal fire distribution policy is the same for the
two formulations. In other words, we would like to determine upon what
groups of model parameters the optimal allocation rule depends and whether
this depends upon the particular form of model adopted (here deterministic
or stochastic). The things that can be compared between the two models
are (1) the optimal fire distribution policy and (2) the optimal (expected)
return. It is the opinion of the authors that the second criterion (i.e.
optimal return) is only significant when there are differences between the
optimal policies from the two models. Furthermore, there are two types of
comparisons that we can make between the models: one is quantitative and
the other is qualitative.

A direct quantitative comparison of the optimal policies obtained
from the two formulations is impossible: on the one hand, for the deterministic
model we have a piecewise differentiable battle trajectory, while on the
other hand, for the stochastic model we have a discontinuous Markov process
describing the force levels. Thus, we have $\phi^*(t, x_1, x_2, y)$ for the
deterministic formulation with $x_1$, $x_2$, and $y$ varying continuously over time,
and we have $\phi^*_S(t, m_1, m_2, n)$ for the stochastic formulation with $m_1$, $m_2$,
and $n$ restricted to be non-negative integers and casualties occurring
randomly as a Markov jump process. The impossibility of directly comparing
$\phi^*(t, x_1, x_2, y)$ and $\phi^*_S(t, m_1, m_2, n)$ continuously over time should be apparent.

The only papers known to the authors in which a quantitative comparison
between results for deterministic and stochastic optimal control problems
is made are [48] and [49]. In both papers the state space is continuous in
the stochastic problem.

Nevertheless, we can still qualitatively compare the structures of
the two policies. There is, however, a difficulty in that $\phi^*_S(\tau, m_1, m_2, n)$
represents a conditional policy, i.e. the optimal policy given that the
system is in state $(m_1, m_2, n)$ with "backwards time" $\tau$ remaining in the
battle. When a state transition occurs (randomly) to $(m_1', m_2', n')$, then
the optimal policy accordingly becomes $\phi^*_S(\tau, m_1', m_2', n')$. In comparing the
optimal policies this should be taken into account, since it does not seem
appropriate to compare $\phi^*_S(\tau, m_1^0, m_2^0, n^0)$ with $m_1^0$, $m_2^0$, and $n^0$ held
constant to $\phi^*(t, x_1, x_2, y)$ with $x_1$, $x_2$, and $y$ changing (continuously)
over time. Since for the stochastic formulation it does not make sense
to consider an "average" optimal policy or the optimal policy for "average"
force levels, for comparison with the optimal policy for the deterministic
formulation we have considered a realization of the stochastic attrition
process in which the force levels are always "near to" those of the
corresponding deterministic process. In other words, we will compare $\phi^*(t, x_1, x_2, y)$
to $\phi^*_S(\tau, m_1, m_2, n)$ at selected values of $x_1$, $x_2$, and $y$. The force levels
in the deterministic model are rounded to integers to yield the values of
$m_1$, $m_2$, and $n$ as follows: $m_1 = [x_1] + 1$ (and $m_1 = 0$ when $x_1 = 0$),
where $[x]$ denotes "the greatest integer in $x$," e.g. $[3.96] = 3$, and
similarly for $m_2$ and $n$.
Moreover, in our comparison we will try to use the results obtained from
the deterministic formulation to gain insight into the behavior of the
optimal policy for the stochastic control problem. In other words, we
will try to explain results from the stochastic formulation by considering
the corresponding behavior for the deterministic formulation.
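
The rounding rule used to pass from deterministic to stochastic force levels can be sketched in a few lines of Python. This is only an illustration of the rule as stated (greatest integer plus one, with zero mapped to zero); the function names are hypothetical, and the source does not say how exactly integral force levels were treated, so the sketch applies the formula literally.

```python
import math


def round_force_level(x: float) -> int:
    """Map a continuous force level x >= 0 to an integer level.

    Implements m = [x] + 1 for x > 0 and m = 0 for x = 0, where [x] is
    the greatest integer in x.
    """
    if x < 0:
        raise ValueError("force levels must be non-negative")
    if x == 0:
        return 0
    return math.floor(x) + 1


def round_state(x1: float, x2: float, y: float) -> tuple:
    """Apply the same rule to each of the three force levels."""
    return (round_force_level(x1), round_force_level(x2), round_force_level(y))


# Hypothetical deterministic state, chosen only for illustration.
print(round_state(3.96, 0.0, 4.2))
```

With this rule a stochastic force level stays at its current integer value until the deterministic level has dropped by a full unit, which is one way of keeping the realization "near to" the deterministic trajectory.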

Numerical results have been generated using two FORTRAN programs
run on an IBM 360-67 computer. The program which generates $\phi^*(t, x_1, x_2, y)$
(and also the force level trajectories) has been discussed in Section 4.5.
The program which generates $\phi^*_S(t, m_1, m_2, n)$ performs the computations
described in Section 5.3. The program for the stochastic formulation is
limited by computer memory requirements. Results for all force levels are
retained for two time steps. A battle with $m_1^0 = 5$, $m_2^0 = 5$, and $n^0 = 5$
requires 200,000 bytes of computer memory, and this increases exponentially
with the force levels as Table IV indicates. Thus, most runs of the computer
program for the stochastic formulation have been with the above as the
upper limit for initial force levels, although we have run one case with
$m_1^0 = 9$, $m_2^0 = 9$, and $n^0 = 9$ which required nearly 2,000,000 bytes of
memory.

The above computer programs have been run for over fifteen different
"parameter sets," typical examples of which are shown in Table V. In all
cases we have chosen parameter values so that $a_1 b_1 > a_2 b_2$. The optimal
policies for the deterministic and the stochastic formulations have been
compared as discussed above. The results of these comparisons will now be
summarized.


This is done so that an interval process (time between casualties) of the
casualty process will be "similar" in the deterministic and stochastic
formulations.

Table V. Parameter Sets Used to Generate Numerical
Results Given in Tables VI through VIII.

Parameter
Set         $a_1$    $a_2$    $b_1$    $b_2$    $p$     $q$     $r$

1           0.025    0.015    0.035    0.005    0.75    2.25    2.0
2           0.005    0.003    0.007    0.001    0.15    0.45    0.4
3           0.085    0.080    0.03     0.03     1.0     2.0     2.0

Note: For all the above parameter sets we have $a_1 b_1 > a_2 b_2$ and $a_1 p < a_2 q$.
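
The note to Table V can be checked directly by computing the two parameter groups $R = a_1 b_1/(a_2 b_2)$ and $\delta = a_1 p/(a_2 q)$ for each parameter set. The following Python sketch does this; it assumes that the seven columns of Table V read $a_1$, $a_2$, $b_1$, $b_2$, $p$, $q$, $r$ in that order, and the dictionary and function names are hypothetical.

```python
# Values transcribed from Table V, assuming the column order
# a1, a2, b1, b2, p, q, r.
PARAMETER_SETS = {
    1: dict(a1=0.025, a2=0.015, b1=0.035, b2=0.005, p=0.75, q=2.25, r=2.0),
    2: dict(a1=0.005, a2=0.003, b1=0.007, b2=0.001, p=0.15, q=0.45, r=0.4),
    3: dict(a1=0.085, a2=0.080, b1=0.03, b2=0.03, p=1.0, q=2.0, r=2.0),
}


def parameter_groups(a1, a2, b1, b2, p, q, r):
    """Return (R, delta) with R = a1*b1/(a2*b2) and delta = a1*p/(a2*q)."""
    R = (a1 * b1) / (a2 * b2)
    delta = (a1 * p) / (a2 * q)
    return R, delta


for label, params in PARAMETER_SETS.items():
    R, delta = parameter_groups(**params)
    # a1*b1 > a2*b2 is equivalent to R > 1; a1*p < a2*q to delta < 1.
    print(f"set {label}: R = {R:.4f} (R > 1: {R > 1}), "
          f"delta = {delta:.4f} (delta < 1: {delta < 1})")
```

Under the assumed column order, all three sets satisfy $R > 1$ and $\delta < 1$, i.e. they fall under Case (2) discussed in the text.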

The first thing to be pointed out is that the optimal fire distribution
policy for both formulations has the property that $\phi^*$ is either
0 or 1 (almost everywhere in time). For the deterministic formulation,
we have shown [34] that a singular solution is impossible and that $\phi^*$
must be 0 or 1 except for at most one point in time. Although we have not
proved such a result for the stochastic formulation, we have never encountered
any exception to it in all our numerical computations. As we have discussed
above, two cases must be distinguished:

Case (1)     $a_1 p \ge a_2 q$,

Case (2)     $a_1 p < a_2 q$.

For Case (1): $a_1 p \ge a_2 q$, the optimal policy is apparently identical
for both formulations: $\phi^*(t, x_1, x_2, y) = \phi^*_S(t, m_1, m_2, n) = 1$ for $x_1 > 0$
(or $m_1 > 0$). We recall that this result has been proved for the
deterministic formulation. Although a proof has not been found, it apparently
is also true for the stochastic formulation. No exception has been
encountered in all the cases for which numerical determinations have been made.


See [36] for a discussion of why this is so and for an example of a similar
problem with a different attrition process for which $\phi^*$ may take on an
intermediate value, i.e. $0 < \phi^* < 1$ (see also [38]).

For Case (2): $a_1 p < a_2 q$, the optimal policies are similar but not
identical. The basic structures are apparently essentially the same. As
discussed above, the two policies have been compared at selected points
along a deterministic trajectory by considering a corresponding realization
of the stochastic process obtained by rounding the deterministic force
levels. The time of such a comparison is rounded up to the next whole
minute in the case of the occurrence of a casualty and to the next 0.01
minute in the case of a switch in fire distribution. Cases corresponding
to over ten parameter sets have been considered; illustrative examples of
such parameter sets are shown in Table V.

In Table VI we show some typical comparisons. Although not shown
in Table VI, it should be noted that in all the cases numerically computed
$\phi^*_S(\tau, m_1, m_2, n)$ had the property that for constant $m_1$, $m_2$, and $n$
$\phi^*_S(\tau, m_1, m_2, n) = 0$ for $0 \le \tau < \tau_1$ and $\phi^*_S(\tau, m_1, m_2, n) = 1$ for $\tau > \tau_1$,
where $\tau$ denotes the "backwards time." In Table VI we show the optimal
policies for the two formulations for two parameter sets. The optimal
policies are given at discrete points in time following the above discussion.
These times correspond to a switching time in one of the formulations or
the occurrence of a casualty in the "typical" realization of the stochastic
process. The deterministic force levels $x_1$, $x_2$, and $y$ from which
$m_1$, $m_2$, and $n$ have been determined are not shown in Table VI. The
optimal returns for the two formulations are also shown.
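
The threshold property of the stochastic policy in backwards time can be written down as a small Python sketch. It assumes only the structure stated in the text (policy 0 for $0 \le \tau < \tau_1$, policy 1 for $\tau > \tau_1$, with $\tau = t_f - t$); the function names and the numerical values of $t_f$ and $\tau_1$ below are hypothetical and are not taken from the tables.

```python
def backward_time(t: float, t_f: float) -> float:
    """Convert elapsed (forward) time t into backwards time tau = t_f - t."""
    return t_f - t


def threshold_policy(tau: float, tau_1: float):
    """Return 0 for 0 <= tau < tau_1 and 1 for tau > tau_1.

    The value exactly at tau == tau_1 is not specified by the text, so
    None is returned there.
    """
    if tau < 0:
        raise ValueError("backwards time must be non-negative")
    if tau < tau_1:
        return 0
    if tau > tau_1:
        return 1
    return None


# Hypothetical battle length and switching time, for illustration only.
T_F = 50.0
TAU_1 = 10.0
for t in (0.0, 30.0, 45.0):
    print(t, threshold_policy(backward_time(t, T_F), TAU_1))
```

In forward time this reads as firing at $X_1$ early in the battle and switching to $X_2$ once less than $\tau_1$ remains; in the stochastic problem $\tau_1$ would have to be recomputed whenever the state $(m_1, m_2, n)$ changes.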

The results shown in Table VI are typical and indicate (at least
for all the cases so far computed) that there is no fundamental difference
between the structures of the two optimal policies, at least where the
deterministic battle does not terminate prematurely, i.e. $t_f = t_{max}$.

Thus, these remarks apply to cases in which optimal deterministic trajectories
lead to terminal states $S_1$, $S_2$, and $S_3$.

Table VI. Comparisons of Results from Deterministic
and Stochastic Optimal Control Problems
(Deterministic Trajectory Leads to Terminal State S1)

Parameter Set 1

Elapsed Time, t      Force Levels       $\phi^*(t,x_1,x_2,y)$   Deterministic     $\phi^*_S(t,m_1,m_2,n)$   $S(t,m_1,m_2,n)$
(minutes)            $m_1$  $m_2$  $n$                          Optimal Return

0                    2    5    3        1                       -10.95            1                         -8.93
13                   2    5    2        1                       -10.95            1                         -11.16
18                   1    5    2        1                       -10.95            1                         -9.12
31                   1    5    1        1                       -10.95            1                         -10.96
35.39                1    5    1        1                       -10.95            0                         -10.79
41.28                1    5    1        0                       -10.95            0                         -10.54
50 = t_f = t_max     1    5    1        0                       -10.95            0                         -10.00

Parameter Set 2

Elapsed Time, t      $m_1$    Deterministic     $\phi^*_S(t,m_1,m_2,n)$   $S(t,m_1,m_2,n)$
(minutes)                     Optimal Return

0                    5        -2.06             1                         -0.62
27                   5        -2.06             1                         -2.17
50                   4        -2.06             1                         -1.67
55                   4        -2.06             0                         -1.64
56                   4        -2.06             0                         -2.06
56.38                4        -2.06             0                         -2.05
87                   4        -2.06             0                         -2.18
100 = t_f = t_max    4        -2.06             0                         -2.05

Note: The entries for $m_2$, $n$, and $\phi^*(t,x_1,x_2,y)$ in this part of the table are illegible in the source.

Parameter Set 2

Elapsed Time, t      Force Levels       $\phi^*(t,x_1,x_2,y)$   $\phi^*_S(t,m_1,m_2,n)$
(minutes)            $m_1$  $m_2$  $n$

0                    5    5    5        1                       1
5.61                 5    5    5        1                       0
6.38                 5    5    5        0                       0
26                   5    5    4        0                       0
50 = t_f = t_max     5    5    4        0                       0


The reader should note that $\phi^*_S$ changes somewhat earlier in forward time
from 1 to 0 than does $\phi^*$ (at least for the realization of the stochastic
process considered here).

In cases in which the deterministic battle ends prematurely (i.e.
the optimal trajectory leads to $S_4$, $S_5$, $S_6$, or $S_7$) more pronounced
quantitative differences may occur. This is illustrated by the cases shown
in Table VII. As noted above, the deterministic trajectory determines at
which values of $m_1$, $m_2$, $n$, and $t$ we look at $\phi^*_S$. This should explain
to the reader why the stochastic results shown in Table VII are not realizable.
Thus, for the first battle shown in Table VII, a realization of the stochastic
battle would evolve differently (in structure) from the deterministic battle
due to this difference in the optimal controls. The authors feel that this
is due to the fact that Y marginally wins the deterministic battle, and
thus in the stochastic model there is a fairly good probability at $t$
much less than $t_{max}$ that Y will lose the battle. In other words, there
are some possible probabilistic trajectories which yield a reduced payoff
to Y. These are weighted in the stochastic decision process, and Y
consequently follows a more conservative policy for the stochastic formulation.
For the case of the first battle shown in Table VII, Y essentially gives
up his chances of winning to guarantee a given level of return. This
phenomenon is similar to the "flypaper effect" noted by Whittle [48] in
certain stochastic optimal control problems. In the second battle shown,
Y achieves a clear-cut victory in the deterministic battle, and this
phenomenon does not occur.


A transition from $(m_1, m_2, n) = (3,5,5)$ to $(2,5,5)$ is impossible when $\phi^*_S = 0$.

This probability has not been explicitly determined.

Table VII. Comparisons of Results from Deterministic
and Stochastic Optimal Control Problems
(Deterministic Trajectory Leads to Terminal State S7)

Parameter Set 3, $t_{max}$ = 50 minutes

Elapsed Time, t      Force Levels       $\phi^*_S(t,m_1,m_2,n)$
(minutes)            $m_1$  $m_2$  $n$

0                    3    5    5        0
3                    2    5    5        0
5                    2    5    4        0
6                    1    5    4        0
8.59                 0    5    4        0
11                   0    5    3        0
13                   0    4    3        0
18                   0    3    3        0
21                   0    3    2        0
24                   0    2    2        0
31                   0    1    2        0
40.1 = t_f           0    0    2        0

Note: The entries of the $\phi^*(t,x_1,x_2,y)$ column for this battle are illegible in the source.

Parameter Set 3

Elapsed Time, t      Force Levels       $\phi^*$   $\phi^*_{S,50}$   $\phi^*_{S,40}$   $\phi^*_{S,30}$   $\phi^*_{S,20}$
(minutes)            $m_1$  $m_2$  $n$

0                    2    3    5        1          1                 1                 1                 0
3                    1    3    5        1          1                 1                 0                 0
5.04                 0    3    5        0          0                 0                 0                 0
8                    0    2    5        0          0                 0                 0                 0
11                   0    1    5        0          0                 0                 0                 0
14                   0    1    4        0          0                 0                 0                 0
14.11 = t_f          0    0    4        0          0                 0                 0                 0

Note: $\phi^*_{S,40}$ denotes $\phi^*_S(t, m_1, m_2, n)$ computed with $t_{max}$ = 40 minutes.

In addition, in cases in which there is a premature termination in
the deterministic formulation, the optimal policy for Y in the corresponding
stochastic problem is affected by the length of the "perceived planning
horizon." This effect is shown in the data for the second battle of Table
VII in which optimal policies are given for stochastic battles of varying
lengths. We see that when the deterministic battle ends near to the
scheduled end of the stochastic battle, Y follows a more conservative
policy in the stochastic battle. Since there is some chance that Y cannot
annihilate the X forces in the "perceived length of battle," he follows
a conservative policy of firing at $X_2$. This might, in fact, explain the
results for the first battle. Other similar phenomena have been encountered
in cases not shown here.

Finally, in Table VIII we show that the optimal policy followed by
Y in a realization of the stochastic combat process may differ appreciably
from that for the deterministic formulation if the realization does not
"follow" the deterministic trajectory. It is seen that $\phi^*_S$ may repeatedly
switch back and forth from 0 to 1 for certain realizations of the stochastic
process. This is quite different from the corresponding behavior for the
deterministic version.
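
To make the notion of a "realization" concrete, the following Python sketch generates one sample path of a stochastic attrition process under a state-dependent fire distribution policy. It is a minimal illustration and not the authors' FORTRAN program: the casualty rates (square-law form, $\phi a_1 n$ for $X_1$, $(1-\phi) a_2 n$ for $X_2$, and $b_1 m_1 + b_2 m_2$ for Y) are an assumption, the small-time-step approximation Prob[casualty in $\Delta t$] = rate times $\Delta t$ is used, and the policy supplied in the example is a hypothetical rule, not the optimal policy $\phi^*_S$.

```python
import random


def simulate_realization(m1, m2, n, a1, a2, b1, b2, policy, t_max,
                         dt=0.01, seed=0):
    """Generate one sample path [(t, m1, m2, n), ...] of the jump process.

    policy(t, m1, m2, n) returns the fraction of Y fire directed at X1.
    Rates are ASSUMED to be of square-law form; dt must be small enough
    that (total rate) * dt is much less than 1.
    """
    rng = random.Random(seed)
    path = [(0.0, m1, m2, n)]
    steps = int(round(t_max / dt))
    for k in range(steps):
        # Stop when Y or both X force types are annihilated.
        if n == 0 or (m1 == 0 and m2 == 0):
            break
        t = k * dt
        phi = policy(t, m1, m2, n)
        rate_x1 = phi * a1 * n if m1 > 0 else 0.0
        rate_x2 = (1.0 - phi) * a2 * n if m2 > 0 else 0.0
        rate_y = b1 * m1 + b2 * m2
        u = rng.random()
        if u < rate_x1 * dt:
            m1 -= 1
        elif u < (rate_x1 + rate_x2) * dt:
            m2 -= 1
        elif u < (rate_x1 + rate_x2 + rate_y) * dt:
            n -= 1
        else:
            continue  # no casualty in this time step
        path.append((t + dt, m1, m2, n))
    return path


def fire_at_x1_first(t, m1, m2, n):
    """Hypothetical policy: engage X1 while any remain, then X2."""
    return 1.0 if m1 > 0 else 0.0


# Attrition coefficients as read from parameter set 3 of Table V.
for state in simulate_realization(3, 5, 5, 0.085, 0.080, 0.03, 0.03,
                                  fire_at_x1_first, t_max=50.0):
    print(state)
```

Because the policy is evaluated at the current state in every time step, a different sequence of casualties leads to a different sequence of controls, which is the effect Table VIII illustrates for the optimal stochastic policy.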

7. Discussion.

In this section we discuss what we have learned from the above comparison.
First and foremost, the authors feel that the deterministic
formulation provides more insight into the structure of the optimal fire
distribution policy. The explicit dependence of the optimal control upon
various parameter groups (these are (1) $R = a_1 b_1/(a_2 b_2)$, (2) $\delta = a_1 p/(a_2 q)$,
and (3) $\alpha = r\sqrt{b_2}/(q\sqrt{a_2})$) is readily obtained for the deterministic optimal
control problem. This has not been true for the stochastic problem, for
which only the dependence upon $\delta$ has been analytically obtained.

Table VIII. One Possible Dependence of Optimal Stochastic
Control on Realization of Casualties in
Stochastic Lanchester Attrition Process
(Deterministic Trajectory Leads to Terminal State S7; See Table VII.)

Parameter Set 3, $t_{max}$ = 50 minutes

Elapsed Time, t      Force Levels       $\phi^*_S(t,m_1,m_2,n)$
(minutes)            $m_1$  $m_2$  $n$

0                    3    5    5        0
0.5                  3    4    5        0
0.7                  3    3    5        1
10.0                 2    3    5        1
15.0                 2    3    4        0
20.0                 2    2    4        1
23.55                2    2    4        0
24.0                 2    2    3        0
25.0                 2    1    3        1
26.0                 1    1    3        1
30.0                 1    1    2        0
35.0                 1    0    2        1

Let us now summarize the observed differences and similarities
between the structures of the optimal policies for the deterministic and
stochastic formulations. The similarities are: (1) optimal policy always
0 or 1, (2) same parameter groups ($R$, $\delta$, and $\alpha$) upon which optimal
policy depends, (3) optimal policy dependent upon force levels and
whether Y wins or loses, (4) in both models $\phi^* = 1$ for $x_1 > 0$ when
$\delta \ge 1$ and $R > 1$, and (5) $\phi^* = 0$ for $t \in (T - \tau_1, T]$ when $0 \le \delta < 1 < R$;
furthermore $\tau_1 = \tau_1(\alpha)$. The differences are: (1) in the stochastic
formulation the optimal policy actually implemented (i.e. followed) in a
battle depends upon the battle's probabilistic (forward) evolution (i.e.
the realization of the stochastic process) and the time remaining in the
prescribed duration battle, and (2) $\tau_1$ is "greater in the stochastic
model" except for cases corresponding to premature termination in the
deterministic battle. Overall, we feel that an understanding of the
structure of an optimal policy is best developed by considering the
deterministic version of such a combat problem. For problems too complex
for analytic treatment, rules of thumb for approximating an optimal policy
are probably best obtained from deterministic formulations.
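
Similarities (4) and (5) above amount to a simple decision rule in the parameter groups $R$ and $\delta$, which the following Python sketch encodes. It covers only the two cases stated in the summary and returns `None` wherever the summary does not fix the policy; the function name is hypothetical, and the switching time $\tau_1$ is treated as a given input rather than computed.

```python
def summarized_policy(R, delta, x1, tau, tau_1):
    """Return the fire fraction on X1 implied by similarities (4) and (5).

    (4) policy = 1 for x1 > 0 when delta >= 1 and R > 1.
    (5) policy = 0 for backwards time 0 <= tau < tau_1 when
        0 <= delta < 1 < R (equivalently t in (T - tau_1, T]).
    Returns None where the summary leaves the policy unspecified.
    """
    if R <= 1:
        return None
    if delta >= 1:
        return 1 if x1 > 0 else None
    if 0 <= delta < 1 and 0 <= tau < tau_1:
        return 0
    return None


# Hypothetical inputs, for illustration only.
print(summarized_policy(R=11.67, delta=1.2, x1=2.0, tau=30.0, tau_1=8.0))
print(summarized_policy(R=11.67, delta=0.56, x1=2.0, tau=5.0, tau_1=8.0))
print(summarized_policy(R=11.67, delta=0.56, x1=2.0, tau=30.0, tau_1=8.0))
```

The third call returns `None` because, for $\delta < 1 < R$ and $\tau > \tau_1$, the summary states only that the policy depends on force levels and on whether Y wins or loses.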


In [34] and [36] one can find further discussion of the structure of the
optimal policy, including interpretation of such parameter groups. The reader
may find the following interpretations useful for understanding the solution
to the problem studied in the paper at hand. The quantity $a_1 b_1$ may be thought
of as the rate of destroying $X_1$'s kill capability against Y. It is a measure
of strategic (long run) return. The quantity $a_1 p$ represents the rate of
destruction of $X_1$ value by Y at the end of battle. Thus, it represents short
run return. The quantity $r\sqrt{b_2}$ reflects the loss of Y value at the end of
battle so that $\alpha$ measures the loss of Y value relative to that of $X_2$ at
the end of battle.

Moreover, $\tau_1$ depends upon $m_1$, $m_2$, and $n$ in the stochastic optimal control
problem.

Finally, we would like to point out that there is a circumstance
under which the stochastic formulation is to be preferred over the
deterministic one, namely, when there is a small number (approximately
three or under) of each combatant type. As noted above, obtaining a
numerical approximate solution to the optimal stochastic control problem
is limited to small numbers of combatants due to computer memory
requirements. In such cases of small numbers of combatants (and a
stochastic attrition process), however, the stochastic formulation as a Markov chain
is to be preferred when the required computer resources are available, for
the obvious reason that the deterministic differential equation model
cannot adequately describe the situation. This point made comparison of
results from the two formulations difficult.

8. Implications for Defense Planners.

The  authors  feel  that  the  study  of  even  the  very  simplest  abstractions 
(idealizations)  of  tactical  allocation  structures  as  considered  in  this 
paper  has  yielded  significant  implications  for  defense  planners  and 
military  operations  analysts.   First  and  foremost  is  the  fact  that  study 
of  such  deterministic  optimal  control  problems  provides  much  more  insight 
into  the  structure  of  optimal  allocation  policies  than  corresponding  stochastic 
formulations.   We  feel  that  such  deterministic  formulations  provide  a  better 
understanding  of  the  effects  of  modelling  assumptions  on  optimal  military 
strategies  derived  from  the  mathematical  models.   This  is,  of  course, 
essential  for  determining  optimal  (or  near-optimal)  solutions  to  real  world 
problems  that  are  far  too  complex  to  be  solved  by  exact  analytic  methods. 


These grow exponentially as force levels increase because of the way in
which a solution must be "built up." See Figure 1 and Table IV for
illustrations of this point.

Moreover, one might apply general principles or rules of thumb developed
from the study of such idealizations to higher resolution studies which,
for example, might use computer simulation methods.

The  study  of  the  deterministic  optimal  control  problem  (7)  in  this 
paper  yields  several  significant  results  which  should  be  kept  in  mind  by 
practitioners  who  perform  more  detailed  computer  simulation  studies. 
These  are 

(1)  Force  levels  do  affect  optimal  strategies.   Whether  one  "wins"  or 
"loses"  affects  optimal  strategies. 

(2)  Even the nature of the scenario (terminal control or prescribed
duration conflict) may affect optimal strategies. Thus, if one develops
"good" tactics for a 90 day campaign, such tactics need not be "good"
if the conflict does not terminate at the prescribed time.

(3)  The  nature  of  the  attrition  process  has  a  significant  effect  upon 
optimal  strategies. 

Finally,  the  authors  feel  that  the  above  results  indicate  that  more 
basic  research  should  be  done  on  the  termination  of  battles  and  wars  as 
well  as  combat  attrition  theories.   The  demonstrated  sensitivity  of  results 
obtained  from  optimization  problems  like  the  one  considered  here  shows 
this. 


This result has been pointed out elsewhere [36], [38] and is partially
based on the study of a similar problem [38].

Some work has been done in this direction [14], [33], [46], [47], although it
does not appear to be widely known among practicing analysts.

APPENDIX. Explanation of Notation.

The symbols which are used in this paper are defined as follows:

$a_1, a_2, b_1, b_2$ = constant attrition-rate coefficients,

$A(m,n)$, $B(m,n)$ = attrition rates of X and Y forces, respectively, in
stochastic battle; it should be noted that
Prob[one X casualty in interval from $t$ to $t + \Delta t$] = $A(m,n)\,\Delta t$,

$E_{\mathbf{m},\tau}[\cdot]$ = conditional expectation (mathematical expectation of quantity
in brackets at $\tau = 0$ given that at $\tau$ we have $\mathbf{m}(\tau) =
(m_1(\tau), m_2(\tau), n(\tau))$),

$H$ = Hamiltonian function,

$M_1(t)$, $M_2(t)$, $N(t)$ = the numbers (random variables) of $X_1$, $X_2$, and Y
combatants, respectively, at time $t$,

$m_1, m_2, n$ = realizations of the random variables $M_1(t)$, $M_2(t)$, and $N(t)$;
initial values denoted as $m_1^0$, $m_2^0$, $n^0$,

$p, q, r$ = utilities assigned to surviving $X_1$, $X_2$, and Y forces,
respectively,

$p_i(t)$ for $i = 1,2,3$ = dual variable corresponding to $x_i(t)$
($x_3(t) = y(t)$),

$\mathbf{p} = (p_1, p_2, p_3)$ (a vector),

$P(t,m,n)$ = Prob[$M(t) = m$, $N(t) = n$] = state probability,

$P^0 = (x_1^0, x_2^0, y^0)$ = point in the initial state space,

$R = a_1 b_1/(a_2 b_2)$,

$S(\tau, m_1, m_2, n)$ = optimal expected value function,

$\hat{S}$ = numerical approximation to $S(\tau, m_1, m_2, n)$,

$S_i$ for $i = 1, \ldots, 8$ = the $i$-th part of the terminal surface as defined
in Table I,

$s = s(x_1, x_2) = b_1 x_1 + b_2 x_2$,

$t$ = time after beginning of battle,

$t_1$ = time at which $X_1$ is annihilated, i.e. $x_1(t_1) = 0$,

t„  =  first  time  at  which  2b  x  (tjx.  +  b„(x2)2  «  a_y2(t  )   for  an 
extremal  leading  to   S-, 

t„  =  last  time  at  which  fire  is  directed  at  X   for  an  extremal  leading 
to  S3, 

t,  =  time  at  which  X~   is  annihilated  (before  X.),  i.e.   x  (t.)  -  0, 

for  an  extremal   leading  to  S0, 

o 

$t_f$ = time at which battle ends,

$t_{max}$ = maximum possible duration for battle, i.e. $t_f \le t_{max}$,

$v = v(t) = a_2 p_2(t) - a_1 p_1(t)$,

$W(\tau, m_1, m_2, n)$ = "switching function" defined by equation (44),

$x_1, x_2, y$ = average force strengths, with initial values $x_1^0, x_2^0, y^0$,

=  coshVa0b_  T   (S.) 


R-6 


2  2   1  v  4'    R-l  * 

qv  a2  ' 

$\delta = a_1 p/(a_2 q)$,

$\eta_i(t)$ for $i = 1,2$ = multiplier corresponding to state variable
inequality constraint $x_i \ge 0$,

$\nu_i$ for $i = 1,2$ = multiplier corresponding to state variable terminal
inequality constraint $x_i(T) \ge 0$,

$\phi$ ($\phi_S$) = fraction of Y-fire directed at $X_1$ in deterministic (stochastic)
formulation; extremal and optimal controls denoted as $\phi^*$ ($\phi^*_S$),

$\Phi = \{0, 1/n(t), 2/n(t), \ldots, (n(t)-1)/n(t), 1\}$ = set of admissible controls in
stochastic problem,

$\tau$ = "backwards time" from the end of battle defined by $\tau = t_f - t$, i.e.
the time remaining before the end of battle,

$\tau_1(S_i)$ = "backwards time" of the first switch in tactics for extremals
leading to $S_i$.

Additionally, remarks similar to those for $\tau_1(S_i)$ above apply to
$t_1(S_i)$, $t_f(S_i)$, etc.
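
As an illustration of the admissible control set in the stochastic problem, the following Python sketch enumerates the fractions $k/n(t)$ for $k = 0, 1, \ldots, n(t)$, on the reading that each of the $n(t)$ surviving Y combatants fires at exactly one target type. The function name is hypothetical, and exact rational arithmetic is used only to keep the fractions readable.

```python
from fractions import Fraction


def admissible_controls(n: int) -> list:
    """Return [0, 1/n, 2/n, ..., 1], the fire fractions available to n firers."""
    if n <= 0:
        raise ValueError("at least one Y combatant is required")
    return [Fraction(k, n) for k in range(n + 1)]


# With four surviving Y combatants there are five admissible allocations.
print([str(f) for f in admissible_controls(4)])
```

The numerical results reported in the paper found the optimal stochastic control only at the end points 0 and 1 of this set.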
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